Introduction Special Issue 
Risk Assessment 


INTRODUCTION: WHY ASSESSMENT 
“MATTERS” IN AN EVIDENCE-BASED 
COMMUNITY CORRECTIONS SYSTEM 
Every day, federal, state, and local commu- 
nity corrections officers are required to make 
decisions about the “risk level” posed by indi- 
viduals placed under community supervi- 
sion. Based on this risk assessment, offenders 
are assigned to a specific supervision level, 
where a variety of “tools and techniques” 
will be applied in an attempt to “manage” 
the risk posed by these offenders, at least for 
their time under community supervision. 
With roughly 100,000 probation and parole 
officers nationwide supervising an offender 
population of close to 5 million, it can cer- 
tainly be argued that risk assessment is the 
single most important decision made by pro- 
bation and parole officers today. With large 
caseloads and shrinking budgets, “triage” is 
the “name of the game” in the community 
corrections field; and assessment is a critical 
first step in any triage process. 

In this special issue of Federal Probation, 
a number of nationally known experts in 
the field of community corrections have 
been asked to provide their own assessments 
of this triage process, focusing specifically 
on the use of both clinical and actuarial 
risk assessments by community corrections 
officers. A number of questions about risk 
assessment are raised and answered in this 
issue, including: 

• What do we actually mean by risk? 

• How reliable and accurate are actuarial 
and clinical assessments? 

• What types of assessment instruments 
are available for adult offenders, juvenile 
offenders, and specific offender groups, 
such as sex offenders, drug offenders, 
and/or mentally ill offenders?, and perhaps most importantly, 

• What evidence exists that links initial 
(and ongoing) risk assessment (using 
actuarial and clinical assessment tools) 
with subsequent offender outcomes (i.e., 
the link between assessment and risk 
reduction)? 


JAMES BYRNE, GUEST EDITOR 


Department of Criminal Justice and Criminology 


As the authors of the articles in this spe- 
cial issue demonstrate, we currently know 
more about how to classify offenders into sev- 
eral categories of “risk” of re-offending than 
we know about how to reduce their “risk” of 
re-offending while under community super- 
vision. When viewed in this context, debates 
over the use of actuarial vs. clinical assess- 
ment tools tend to obscure a larger issue: Are 
we interested in short-term offender control 
or long-term offender change? I would argue 
that in an evidence-based community cor- 
rections system, a clear link needs to be 
established, beginning with 1) valid and reli- 
able initial (and ongoing) assessment, con- 
tinuing to 2) development of risk-specific 
community corrections interventions, and 
concluding with 3) identification of subse- 
quent offender outcomes in communities 
with different community risk levels. 

Because risk assessment is the “lynchpin” 
in this process, it is critical to the success 
of community corrections, both in terms 
of short-term offender control and long- 
term offender change. While much time and 
effort has focused on how to assess the risk 
level of individual offenders, far less research 
has been focused on the assessment of com- 
munity risk level (see, e.g. Pattavina, et al., 
2006). As we conduct further research on 
how different individuals (such as high- vs. 
low-risk offenders) respond to community 
supervision strategies in different commu- 
nities (such as high- vs. low-risk neighbor- 
hoods), we will take the logical next step in 
the development of an evidence-based com- 
munity corrections system. 

In the first article of this special issue, 
“Assessment with a Flair: Offender Account- 
ability in Supervision Plans,” Faye Taxman— 
one of this country’s leading experts on case 
planning in community corrections—argues 
that the first rule in evidence-based com- 
munity corrections practice is that servic- 
es (both treatment and control-based) to 
offenders should vary in intensity based 
on the risk level of offenders. She describes 
how the risk level of the offender should be 
considered: first in terms of the intensity 
and appropriateness of services, and next in 
terms of the offender’s individual case plan, 
which should focus on the clinically-based 
assessment of criminogenic needs (substance 
abuse, family dysfunction, peer associates, 
criminal personality, antisocial cognitions, 
low self control, and other factors, including 
mental health). 

Patricia Harris expands on the notion 
that both actuarial and clinical assessments 
are essential components of an evidence- 
based community corrections system in her 
excellent review of both types of risk assess- 
ment in community corrections. Her article, 
“What Community Supervision Officers Need 
to Know About Actuarial Risk Assessment 
and Clinical Judgment,” reviews the empiri- 
cal research on the adequacy of both clinical 
and actuarially-based risk assessment instru- 
ments, and then identifies three impedi- 
ments to the full implementation of actuarial 
risk assessment in community corrections: 

• unclear/cursory presentations on the purpose of actuarial risk assessments to line 
staff; 

• poor communication of offender risk 
assessment results; and 

• failure to recognize the importance of 
clinical judgment (and skill) in the (actuarial) risk assessment process. 

The third article that focuses on the 
use of actuarial and clinical assessment 
tools—“Clinical versus Actuarial Judgments 
in Criminal Justice Decisions: Should One 
Replace the Other?”—is authored by two 
of the country’s foremost experts on risk 
assessment, Stephen Gottfredson and Laura 
Moriarty. They begin their review by offer- 
ing the following unequivocal assessment of 
the two competing approaches to risk assess- 
ment: “In virtually all decision-making situ- 
ations that have been studied, actuarially 
developed devices outperform human judg- 
ments.” While they argue that “there is a 
place for human judgment and experience in 
the decision-making process,” they offer the 
following caveat: over-reliance on human 
judgment may undermine the accuracy of 
the risk assessment, because probation and 
parole officers may “concentrate on information that is demonstrably not predictive 
of offender behavioral outcomes.” 

While the three articles that have just 
been highlighted present an overview of 
existing research that has been conducted on 
both actuarial and clinical risk assessment 
instruments, the next three articles highlight 
the use of different types of assessment strat- 
egies for different types of offenders. First, 
Albert Roberts and Kimberly Bender exam- 
ine the most commonly used risk and mental 
health needs assessment tools in juvenile 
corrections in their uniquely titled article, 
“Overcoming Sisyphus: Effective Prediction 
of Mental Health Disorders and Recidivism 
among Delinquents.” The authors discuss the 
implications of their review for the two cen- 
tral goals of youth assessment in juvenile jus- 
tice settings: 1) the safety of the community, 
and 2) the rehabilitation (and clinical treatment) of individual juvenile offenders. [Note: 
According to the “Myth of Sisyphus” webpage, 
“The gods had condemned Sisyphus to ceaselessly rolling a rock to the top of a mountain, 
whence the stone would fall back of its own 
weight. They had thought with some reason 
that there is no more dreadful punishment 
than futile and hopeless labor.”] 

Next, Lurigio and Swartz identify and 
critically review the use of actuarial and 
clinical risk assessment tools for the large 
number of adult offenders in our commu- 
nity corrections system with serious mental 
illness (SMI). Focusing specifically on the 
instruments used to identify offenders with 
serious mental illness, Lurigio and Swartz 
describe the results of their recent research 
testing the accuracy of two “new generation” 
mental health screening tools in community 
corrections (the K-6 screening tool, and the 
Brief Jail Mental Health Screen). 

Moving from mentally ill offender assess- 
ment devices to sex offender risk assess- 
ment devices, Andrew Harris provides a 
comprehensive review of the research on 
the use of actuarially-based and clinically- 
based assessments of sex offender risk. Har- 
ris’s article underscores one limitation—at 
least to some—of a risk-driven community 
corrections system: If we made decisions 
about sex offenders based solely on actuarial 
risk assessment, then very few sex offenders 
would ever be placed under close supervision, 
because as a group, sex offenders recidivate at 
remarkably low rates. The Harris article rais- 
es important questions for community cor- 
rections managers to consider, not only about 
the accuracy of sex offender-specific assess- 
ment devices using either actuarial or clinical 
assessment instruments, but also about the 
interplay between the actual risk posed by 
various types of sex offenders and the stakes 
(for offenders, victims, probation/parole officers, and the community) associated with 
decisions made on the appropriate level of 
control needed for this group of low-risk 
but high-stakes offenders. For sex offenders, 
over-classification appears to be an inevitable 
consequence of the “risk” assessment pro- 
cess, not because these offenders pose a high 
risk to recidivate, but rather because for this 
group of offenders, any risk (one in 10, one 
in 20, or even one in 100) is unacceptable due to 
the high stakes involved. 

One of the challenges facing community 
corrections managers at the federal, state, 
and local level is the development of a defen- 
sible risk assessment system. By defensible, 
I am referring to a classification system that 
is externally reviewed and objectively vali- 
dated. Two separate articles address this issue 
directly. First, Anthony Flores, Christopher 
Lowenkamp, Paula Smith, and Edward Lates- 
sa examine the predictive accuracy of the 
Level of Service Inventory-Revised (LSI-R) for 
a sample of 2,107 adult federal probationers, 
using subsequent incarceration (rather than 
the traditional re-arrest) as their outcome 
measure. Next, Susan Turner and Terry Fain 
present the results of their validation of the 
“Risk and Resiliency Checklist,” first devel- 
oped in San Diego and currently being used 
in the Los Angeles Probation Department 
as the centerpiece of the department’s new 
automated case planning system. The authors 
found that a youth’s resiliency score—the 
net sum of risk factors (which have negative 
values) and protective factors (which have 
positive values)—is significantly related to 
subsequent recidivism, using re-arrest during 
a 12-month follow-up period (from initial 
assessment) as the outcome measure. 
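
The "net sum" idea described above can be illustrated with a short sketch. This is not the Risk and Resiliency Checklist itself: the item names and point values below are hypothetical, and the only thing taken from the text is that risk factors carry negative values, protective factors carry positive values, and the score is their sum.

```python
from typing import Mapping


def resiliency_score(risk_items: Mapping[str, int],
                     protective_items: Mapping[str, int]) -> int:
    """Return the net sum of signed item values.

    Risk items must be zero or negative; protective items must be zero or
    positive. The sign convention follows the description in the text; the
    items themselves are illustrative assumptions.
    """
    if any(value > 0 for value in risk_items.values()):
        raise ValueError("risk items must not have positive values")
    if any(value < 0 for value in protective_items.values()):
        raise ValueError("protective items must not have negative values")
    return sum(risk_items.values()) + sum(protective_items.values())


# Hypothetical youth: three points of risk, four points of protection.
example_risk = {"delinquent_peers": -2, "school_truancy": -1}
example_protective = {
    "supportive_family": 2,
    "school_engagement": 1,
    "prosocial_activity": 1,
}
example_score = resiliency_score(example_risk, example_protective)  # -3 + 4 = 1
```

Under this convention a higher score means protective factors outweigh risk factors, which is the quantity the authors relate to re-arrest.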

The final two articles in this special issue 
challenge much of the current thinking in 
the area of risk assessment and offer a differ- 
ent set of policy options for readers to con- 
sider. James Austin is one of this country’s 
foremost authorities on institutional and 
community-based classification systems. 
His article, “How Much Risk Can We Take? 
The Mis-use of Risk Assessment in Correc- 
tions” describes the six basic steps to follow 
in developing a risk assessment instrument: 
• Risk assessment instruments must be tested on your (own) correctional population and separately normed for males and females, 

• An inter-rater reliability test must be conducted, 

• A validity test must be conducted, 

• The instrument must allow for dynamic and static factors, 

• The instrument must be compatible with the skill level of your staff, and 

• The risk instrument must have face validity and transparency with staff, probationers, parolees, and policy makers. 
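
As an illustration of what an inter-rater reliability test might involve, the sketch below computes Cohen's kappa, one widely used agreement statistic for two raters. The text does not say which statistic should be used, so the choice of kappa is an assumption, and the officer ratings are invented for the example.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa for two raters who classified the same cases.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each assigned categories independently at their own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Invented ratings: two officers score the same eight cases.
officer_1 = ["high", "high", "low", "moderate", "low", "high", "low", "moderate"]
officer_2 = ["high", "moderate", "low", "moderate", "low", "high", "low", "low"]
kappa = cohens_kappa(officer_1, officer_2)
```

Raw percent agreement would overstate reliability when most cases fall in one risk level, which is why a chance-corrected statistic is a common choice.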


Utilizing the concepts presented in his 
six-step risk assessment development model, 
Austin provides a critique of the best-known 
and most commonly used risk assessment 
instrument, the LSI-R, using illustrative 
examples from recent studies on the reli- 
ability and validity of the LSI-R that he 
conducted in Pennsylvania (parole board), 
Washington, and Vermont (community cor- 
rections). Austin concludes his review by 
identifying the “disconnect” that currently 
exists at the federal, state, and local level 
between an offender’s risk level and the 
availability of services: high-risk offenders 
are under-serviced, while low-risk offenders 
are over-serviced. As Austin suggests, “It 
would be helpful for those in the risk assess- 
ment business to start advocating a more 
reasonable level of intervention that matches 
the risk they have so carefully calibrated” 
(this volume). 

Byrne and Pattavina’s article, “Clinical 
and Actuarial Risk Assessment in an Evi- 
dence-based Corrections System: Issues to 
Consider,” concludes the special issue by 
identifying three topics central to the cur- 
rent debate over the use of actuarial and 
clinical risk assessment in community cor- 
rections: 1) the need to distinguish between 
risk assessment and risk reduction, 2) the 
dumbing down of community corrections 
associated with the development of actuarial 
risk assessment instruments, and 3) the need 
to combine individual risk assessment and 
community risk assessment in the next gen- 
eration of risk-driven supervision strategies. 

Taken together, the ten articles included 
in this special issue of Federal Probation 
offer readers an opportunity to examine 
the empirical evidence on the application 
of actuarial and clinical assessment instru- 
ments in community corrections systems, 
and to consider the new wave of assessment 
instruments being developed for adult and 
juvenile offenders and for specific subgroups 
of offenders (mentally ill offenders, substance 
abusers, sex offenders). Each author raises 
challenging issues for policy makers, practi- 
tioners, and the public to consider regarding 
the proper role of assessment in an evidence- 
based community corrections system. 

What “predictions” can be offered about 
the direction of the field? In the very near 
future, I suspect that the current focus on the 
reliability and validity of risk classification 
systems will be supplanted by discussions of 
how to improve treatment classification sys- 
tems in both institutional and community 
corrections. When this occurs, it will likely 
be the result of an emphasis on offender 
change—rather than short-term offender 
control—as the primary purpose of our 
community corrections system. 
RULE NUMBER ONE in EBP (evidence-based practice) is that high-risk offenders 
should be placed into appropriate treatment 
services, and that low- and moderate-risk 
offenders should not receive the same inten- 
sity of services. (Note: The use of the term 
“services” here includes both treatment and 
control techniques.) While this may seem 
like a simple concept, it encompasses the 
following: 1) use actuarial risk assessments; 2) 
use dynamic criminogenic needs; 3) adopt 
responsivity or matching strategies to link 
offenders to services and controls; and 4) 
administer heterogeneous programs that 
address the myriad of offender issues. The 
goal is to combine all of these together as 
a supervision plan that identifies the goals 
and specifies expectations for the offend- 
er. These expectancies become the binding 
agreements that define the criteria for being 
successful on supervision. Stated simply, 
assessment is not just a stand-alone process; 
instead, it is a process that should lead to the 
goal of a supervision plan that is designed to 
change the behavior of the offender. 
Assessment should be the beginning of 
the correctional process. Of course in con- 
temporary criminal justice practice, it can 
occur at a number of points, including arrest 
and pretrial detention, sentencing, intake 
to probation/parole or prison, and so on. 
In other words, it can occur in numerous 
places, all with slightly different goals—at 
pretrial to determine risk of flight or danger 
to society, at sentencing to determine the 
appropriate punishment and/or placement, 
at prison to determine security levels, and 
at probation/parole to determine risk to the 
community. In all of these calculations, the 
goal of the assessment is to inform decisions about the degree of restrictions that 
an offender should be given based on the 
offender’s history and seriousness of the cur- 
rent offense. The assessment can also con- 
tribute to what is traditionally referred to as 
a “treatment plan,” or more specifically the 
corrective action plan to help the offender 
become a productive citizen and contribut- 
ing member of society. As noted recently by 
Ed Latessa and his colleagues (2002), correc- 
tions practice today seldom ties the assess- 
ment to a plan for the offender. Instead the 
plan for the offender is generally made based 
on judicial or parole board decisions. 

The most frequent stumbling block is an 
understanding of the core elements that are 
embedded in EBP Rule #1, and how to apply 
these elements in practice. That is, with the 
tools that are available, often there is a mis- 
understanding of the concepts of risk and 
needs. Often the terms “static” and “dynam- 
ic” are inappropriately interchanged with 
risk and needs. Risk refers to the actuarial 
(or statistical) likelihood that an offender 
will have further criminal behavior. Dynam- 
ic refers to the dimensions of the person’s 
functionality that, if improved, can affect 
their involvement in criminal behavior. A 
clarification of these concepts is the main 
goal of this paper, with an eye on trying to 
clarify how best to use these concepts in cor- 
rectional practice. 


Faye Taxman, Ph.D. 


The Standardized Risk and 
Needs Assessment Tool 
Dilemma 

Don Andrews and his colleagues (2006) 
recently provided a historical review of the 
concept of objective assessment tools for the 
criminal justice system. The review detailed 
the generational development of assessment 
in the various phases of the criminal jus- 
tice system over the last 80 years: clinical 
assessments of offenders’ risk to the com- 
munity with some emphasis on treatment 
planning, actuarial risk assessment to assess 
the likelihood of further criminal behavior, 
actuarial risk assessment combined with 
dynamic variables to better guide treatment 
planning, and actuarial risk assessment tools 
supplemented by problem-specific tools. The 
development of standardized tools for the 
field has accompanied various needs in the 
criminal justice system, including classifica- 
tion, treatment planning, release decisions 
(from prison, jail, or parole), and sentenc- 
ing. Essentially, assessment tools have been 
developed and used for various purposes, 
which adds to the complication of how to use 
the tool(s). Some tools are designed merely 
to identify risk factors related to certain 
decisions, while others are designed to identify the factors or needs that, if altered, can 
improve offender outcomes. 

As is true for other fields, and as other 
articles in this issue of Federal Probation 
note (see papers by Austin and Harris), a 
major point of discussion in the criminal 
justice field has been the value of standard- 
ized assessment tools compared to that of 
subjective assessments by counselors and 
other correctional staff. The preference for 
subjective assessment is a long-standing 
issue in the field (as well as in psychology, 
education, and other disciplines), since professionals feel confident in their decision-making skills and do not want to succumb 
to a paper-pencil test. But as discussed by 
Harris (2006) in another article of this edi- 
tion, research persists in demonstrating that 
standardized objective tools enhance deci- 
sion making, besides providing institutional 
safeguards against discretionary, biased, or 
inappropriate decisions. The use of stan- 
dardized tools minimizes the potential for 
bias to be introduced into the decision-making process by such human factors as the 
staff person being influenced by the dress, 
mannerisms, and/or attitude of the offender, 
in addition to such obvious factors as gender, 
class, and race. 

Of course, the clinical vs. objective tool 
debate is an overstatement of the relative 
advantage of an interview with the offender. 
Third and fourth generations of assessment 
tools are accompanied by an interview (clini- 
cal in nature with “clinical” referring to an 
interview to collect information from the 
offender in a manner conducive to assessing 
the offender’s risk and needs). The purpose 
of the interview is to gather information 
on key domains and then to evaluate the 
offender’s responses in 
comparison to the official record (e.g., arrest 
records, crime report, treatment history, etc.). 
That is, risk assessment with dynamic factors 
or the latest generation of risk assessment 
accompanied by specialized tools (such as 
drug screeners, mental health screeners, etc.) 
requires a clinical interview to obtain and 
assess information from the subject. A good 
assessment process requires interviewing the 
offender, which allows the criminal justice 
professional to gather, collect, and evaluate 
the offender’s responses along with other 
information obtained in official records. And, 
as promulgated by Taxman and colleagues 
(2004) and Taxman (2004), an important 
part of the interview process is engaging the 
offender in processing his/her own responses 
to the interview questions as part of a process 
of engaging the offender in becoming more 
accountable for his/her behavior. 

Lowenkamp, Latessa, and Holsinger 
(2006) found that many offenders were not 
screened for actuarial risk before being 
placed in community correctional programs 
in Ohio, and that reductions in recidivism 
were noticeable for high-risk offenders in 
correctional programs that tended to be 
multi-dimensional and primarily served 
high-risk offenders. The authors developed 
an actuarial risk tool that focused only on 
the offender’s static risk factors (prior arrests, 
prior incarceration, age at current arrest, 
employed at arrest, history of failure in 
community correctional programs, drug use 
history). In a series of articles and presenta- 
tions, they have reported the same results 
for offenders placed in residential programs, 
intensive supervision programs, and other 
correctional programs in Ohio. Using a 
quasi-experimental design, the researchers 
illustrate that reductions in recidivism are 
possible by using standardized risk tools, 
which help to ensure that high-risk offenders 
receive the more structured services. Their 
research also illustrates how poor classifica- 
tion schemes can result in over-classifying 
offenders (i.e., placing low-risk offenders 
in inappropriate programs) and only serve 
to increase the recidivism of this group of 
offenders. Their research basically supports 
Rule #1 of EBP regarding the importance of 
actuarial risk tools. This is the recent addi- 
tion to a long-standing support for this con- 
cept from individual studies and also, more 
importantly, from recent meta-analyses (see 
meta-analyses such as Andrews, Bonta, & 
Hoge, 1990; Andrews et al., 1990; Lipsey & 
Wilson, 1998; Gottfredson, Najaka, & Wil- 
son, 2001; Wilson, Lipsey, & Derzon, 2003). 
Research studies of late have shown that 
the field is struggling with how best to use 
the concepts of risk and needs in criminal 
justice decisions, and particularly on how 
best to integrate dynamic or need factors. 
A series of articles in the 2006 Crime and 
Delinquency (edited by me and Doug Mar- 
lowe) illustrate how this struggle occurs. 
Taxman and Thanner (2006) detail how a 
randomized trial to examine the efficacy 
of a seamless probation-treatment protocol 
was affected by the classification of offenders 
as drug-involved. Offenders were assessed 
using an actuarial risk tool in one stage of 
the experiment and then a clinical assess- 
ment was conducted to determine drug use. 
Using the standard DSM-IV criteria (Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision), a clinician 
assessed the offenders to be drug abusers. 
(DSM-IV states the accepted criteria for 
abuse and dependency.) In this experiment, 
half of the offenders were classified as high 
risk and half as moderate risk. However, 
few of the offenders in either the high risk 
or the moderate risk categories could be 
classified as drug dependent by the DSM-IV 
criteria. (The intervention involved a cogni- 
tive behavior treatment that was geared for 
offenders with drug problems.) Study find- 
ings indicate that the seamless system had 
no impact overall, but analysis found that 
the seamless system had a positive effect 
on high-risk and drug-dependent (addicted 
or serious abuse problems) offenders. In 
this study, the clinician did not use a stan- 
dardized tool to assess for a drug problem, 
which resulted in overclassifying offenders 
as drug users when in fact many would not 
have met those criteria if a standardized tool 
had been used. Another article in this edition by 
DeMatteo, Festinger, and Marlowe (2006) 
found that in many drug courts numerous offenders are not drug dependent and 
have generally low-threshold drug use (they 
were nevertheless classified as drug offend- 
ers largely due to their involvement in the 
legal system, which is one of the criteria 
for being classified as an abuser). Yet, these 
offenders are asked to participate in a highly 
structured program and required to go to 
drug treatment services. Not surprisingly, 
the drug courts do not tend to demonstrate 
reductions in recidivism. 


Clarifying the Concepts of Risk 
& Dynamic 

Risk and needs vs. static and dynamic? The 
third and fourth generation tools combine 
two concepts into one instrument or pro- 
tocol. The two concepts are: 1) that actu- 
arial risk factors can be used to determine 
the degree to which the offender's history 
predicts that he/she is likely to be a risk in 
the community or in a prison setting (i.e., the 
past predicts the future notion); and 2) that 
needs or psycho-social factors that should be 
ameliorated or addressed can be identified to 
reduce the risk for further involvement in the 
criminal justice system. The combination of 
these two concepts into one instrument or a 
cascading model (in which a screener determines 
whether a problem area needs more in-depth 
inquiry, as in fourth generation 
instruments) evolved from the needs 
of the criminal justice system for better clas- 
sification and treatment placement tools. The 
researchers constructed the Wisconsin Risk 
and Needs Tool to allocate service resources 
accordingly (much like a triage approach, 
where high-risk offenders would receive the 
scarce resources first to prevent harm). This 
resource allocation tool was constructed on a 
management-model premise. 
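
The two ideas in this paragraph, a cascading screener and triage-style allocation, can be sketched in a few lines. Everything named here is hypothetical: the domain names, cutoffs, case identifiers, and scores are assumptions for illustration and are not drawn from the Wisconsin tool or any other instrument.

```python
from typing import List, Mapping


def domains_needing_in_depth_assessment(screener_scores: Mapping[str, int],
                                        cutoffs: Mapping[str, int]) -> List[str]:
    """Cascading step: flag each problem area whose brief screener score
    meets or exceeds its cutoff, so only those areas get a fuller assessment."""
    return sorted(domain for domain, score in screener_scores.items()
                  if score >= cutoffs[domain])


def allocate_program_slots(risk_scores: Mapping[str, int], slots: int) -> List[str]:
    """Triage step: give a limited number of slots to the highest-risk cases first."""
    if slots < 0:
        raise ValueError("slots must be non-negative")
    ranked = sorted(risk_scores, key=lambda case_id: risk_scores[case_id],
                    reverse=True)
    return ranked[:slots]


# Hypothetical screener results for one person and hypothetical cutoffs.
flagged = domains_needing_in_depth_assessment(
    {"substance_use": 4, "mental_health": 1, "employment": 3},
    {"substance_use": 3, "mental_health": 2, "employment": 3},
)

# Hypothetical caseload with two available program slots.
served = allocate_program_slots({"case_a": 7, "case_b": 2, "case_c": 5}, slots=2)
```

In this sketch the screener narrows where in-depth inquiry is needed, while the allocation step decides who receives scarce services; the two decisions use different information and are kept as separate functions for that reason.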

The field has had a difficult time learning 
to use these tools in a manner that would 
facilitate the intended purpose—Rule #1. 
Again, the intended purpose is to isolate 
the criminal drivers while keeping in mind 
the actuarial risk. Criminal drivers refer to 
the people, places, or things that affect an 
individual’s involvement in criminal behav- 
ior. This means that the current status of an 
individual in areas that may in the past have 
been a problem may not be as relevant as 
other areas. Since many of these instruments 
use dichotomous (yes/no) responses or three 
categories of responses (none, some, many), 
practitioners are often left wondering how to 
select the drivers from other precursors. And, 
since many behaviors are intertwined, such as 
co-occurring disorders of mental health and 
substance abuse problems, the practitioner 
needs to determine which factors should be 
addressed as part of the criminal justice sys- 
tem, and which factors may be important for 
the person to address in the greater scheme of 
his or her overall health and well-being, but 
do not necessarily need to be encompassed in 
the criminal justice system. 

Many attitudes, values, and behaviors lie 
on a continuum of “no problem” to “severe 
problem” behavior. This is important to 
keep in mind, because most human beings 
exhibit certain negative traits, but it is the 
degree to which these traits influence the 
subject’s involvement in negative behavior 
such as crime or drug use that concerns the 
criminal justice system. In this case, the 
negative influence is one that predisposes an 
individual to engage in certain acts, or to hold 
certain values or attitudes, when engaging in 
behaviors that are 
covered by the criminal laws. 


Risk Factors: Actuarial 


In determining actuarial risk, we measure 
behaviors that predict negative outcomes 
(increased risk for criminal behavior) (i.e., 
the concept of predictive validity). The actuarial risk generally refers to demographic 
or historical factors (past behaviors) that 
affect the trajectory of an individual. For 
example, age of first arrest (or incarcera- 
tion) is a predictor of further involvement 
with the criminal justice system, since the 
earlier an individual has been involved in the 
criminal justice (or juvenile justice) system, 
the greater the likelihood of future involve- 
ment. The actuarial concept in criminal 
justice is similar to that used in assessing 
risk factors for health insurance (e.g. family 
history, age of onset of a disorder, number of 
occurrences, etc.) or car insurance (e.g. prior 
driving history, speeding violations, etc.). 
As discussed by Gottfredson and Moriarty 
(2006), the statistical methods and methodology for developing these tools are sound. 
The emphasis is placed on criminal behav- 
ior, and the historical factors that predict 
the likelihood that an offender will continue 
criminal conduct. 
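
A minimal sketch of how an additive actuarial scale turns historical factors into a risk band follows. The factors echo those mentioned in the text (age at first arrest, prior arrests, prior incarceration), but the point values and cut scores are invented for illustration; a real instrument derives them statistically from outcome data on the population being assessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticHistory:
    """Historical (static) factors for one person; fields are illustrative."""
    age_at_first_arrest: int
    prior_arrests: int
    prior_incarcerations: int


def static_risk_points(history: StaticHistory) -> int:
    """Sum hypothetical points: earlier onset and longer history score higher."""
    points = 0
    if history.age_at_first_arrest < 18:
        points += 2
    elif history.age_at_first_arrest < 25:
        points += 1
    points += min(history.prior_arrests, 3)  # cap so one factor cannot dominate
    if history.prior_incarcerations > 0:
        points += 1
    return points


def risk_band(points: int) -> str:
    """Map points to a band using hypothetical cut scores."""
    if points >= 5:
        return "high"
    if points >= 3:
        return "moderate"
    return "low"


early_onset = StaticHistory(age_at_first_arrest=16, prior_arrests=4,
                            prior_incarcerations=1)
late_onset = StaticHistory(age_at_first_arrest=30, prior_arrests=0,
                           prior_incarcerations=0)
early_band = risk_band(static_risk_points(early_onset))  # 2 + 3 + 1 = 6 points
late_band = risk_band(static_risk_points(late_onset))    # 0 points
```

Because every input is a fixed historical fact, a score of this kind can rank people by likelihood of further criminal conduct but cannot by itself indicate what should change.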

The key question is the criterion variable 
or the behavior that is being predicted. In 
traditional criminal justice literature, the 
criterion variable is new criminal behavior 
(as measured by new arrests or reincarcera- 
tion). Yet, many proxies that may be used in 
a risk assessment may not be direct measures 
of criminal behavior. Examples are substance 
use (except the tautology that use of illicit 
substances is a criminal act) or other victimless crimes (e.g., prostitution), technical 
violations for failures on probation and/or 
parole supervision, and so on. Clarifying this 
concept is important to differentiate whether 
the behavior being predicted is actual fur- 
ther criminal behavior. It should be noted 
that heightened law enforcement activities 
(arrests) in some geographic areas (which 
increase the odds for arrest) may influence 
certain variables. This is why some research- 
ers are focused on certain classes of behavior 
(e.g. property crimes, violent crimes, etc.) 
that are less susceptible to the neighborhood 
context that an individual resides in. 


Dynamic Factors: Criminal 
Drivers 


The third and fourth generation assessment 
tools include questions about dynamic fac- 
tors, or psycho-social needs that, if unad- 
dressed, tend to increase the risk that the 
individual will commit criminal acts. That is, 
while many of these factors may be present in 
most human beings, it is the degree to which 
they influence an individual’s daily func- 
tionality that determines the degree to which 
they affect the offender’s behavior (criterion 
validity). The important component is that 
these need factors also predict the likelihood 
that an individual will become involved in 
criminal behavior due to the impact on 
the offender’s current behavior. Researchers 
have found that certain domains are more 
likely to negatively impact an individual, 
whereas other domains that we might think, 
using common-sense, have the same impact 
(e.g., mental health status, low educational 
status, or underemployment) are not directly 
related to criminal conduct. 


Substance Abuse 


Frequently the statement is made that over 
70 percent of offenders are drug involved. 
This statement derives from reports regard- 
ing how many in the offender population 
report some use of illicit substances during 
their lifetime (or the lifetime prevalence). 
This statement may not refer to current
use or use that is associated with dysfunctional
behavior. Using a clearer definition,
researchers have generally found that about 
35 to 50 percent of offenders have substance 
abuse patterns that require drug treatment 
(about one third of males, about half of the 
females) (Belenko & Peugh, 2005; Taylor, 
Fitzgerald, Hunt, Reardon, & Brownstein, 
2001). The drug-crime nexus literature is a 
complex web that does not illustrate any cau- 
sality between drug use and other criminal 
behavior, except for heroin or crack addicts, 
where the literature is clearer cut. The alco- 
hol-crime nexus also is convoluted (besides 
the tautology of alcohol consumption in 
public, etc.), and just like the drug-crime lit- 
erature, the relationship between substance 
use and crime depends upon the nature of 
the use and situation. 

Table 1 illustrates the criteria for abuse 
and dependency accepted in the field (APA, 
2004). The DSM-IV criteria differentiate 
between use and abuse, both of which are 
defined by the degree to which the use 
(abuse) affects the person’s daily functions. 
The literature on drugs and crime is most 
clear cut about the impact of providing 
treatment services for drug-dependent her- 
oin and crack addicts—providing treatment 
will reduce recidivism and substance abuse. 
Based on this literature, it is suggested that 
it is important to identify drug-dependent 
addicts and then place them in appropriate 
treatment services. The priority should be 
given to targeting high-need (i.e. dependent) 
substance abusers for appropriate services. 
It should also be noted that those involved 
in the career business (i.e. entrepreneurs 
or those that are involved in dealing, etc.) 
may be classified as abusers when in fact 
their criminal behavior is linked to the busi- 
ness, and not to the drug use. While many 
involved in the business of drug dealing are 
also “dabblers” or users of small quantities 
of substances, their overall use is generally 
not due to compulsive behavior but rather to 
opportunity structures. 

TABLE 1:
DSM-IV-TR Criteria for Substance Abuse and Dependence

CRITERIA FOR SUBSTANCE ABUSE

A pattern of substance use leading to significant impairment or distress, as manifested by one or
more of the following during the past 12-month period:

1. Failure to fulfill major role obligations at work, school, or home, such as repeated absences or
poor work performance related to substance use; substance-related absences, suspensions,
or expulsions from school; neglect of children or household

2. Frequent use of substances in situations in which it is physically hazardous (e.g., driving an
automobile or operating a machine when impaired by substance use)

3. Frequent legal problems (e.g., arrests, disorderly conduct) for substance abuse

4. Continued use despite having persistent or recurrent social or interpersonal problems (e.g.,
arguments with spouse about consequences of intoxication, physical fights)

CRITERIA FOR SUBSTANCE DEPENDENCE

Dependence or significant impairment or distress, as manifested by 3 or more of the following
during a 12-month period:

1. Tolerance or markedly increased amounts of the substance to achieve intoxication or desired
effect or markedly diminished effect with continued use of the same amount of substance

2. Withdrawal symptoms or the use of certain substances to avoid withdrawal symptoms

3. Use of a substance in larger amounts or over a longer period than was intended

4. Persistent desire or unsuccessful efforts to cut down or control substance use

5. Involvement in chronic behavior to obtain the substance, use the substance, or recover
from its effects

6. Reduction or abandonment of social, occupational or recreational activities because of
substance use

7. Use of substances even though there is a persistent or recurrent physical or psychological
problem that is likely to have been caused or exacerbated by the substance

Source: APA, 1994.


Family Dysfunction


Family disarray and histories are generally 
precursors to learned behaviors—some are 
negative such as drug use or criminal behav- 
ior. Within this context, people learn atti- 
tudes, values, and behaviors. Differences exist 
in how families affect the behavior of men and 
women based on the degree of dysfunction in 
the family. For men, the stress from the family 
is to be a contributor (financial and other- 
wise) or to play a major role in the family. For 
women, the stress from the family is to be a 
caretaker or to be subservient to males in their 
lives. To obtain the support that is needed 
from the family, the offender is susceptible to 
responding to the pressure through criminal 
behavior (or drug use). The issues regarding 
the family are complex, in that the household 
may allow and tolerate certain behaviors in 
the home, including substance use or criminal 
behavior. And, the family could have expecta- 
tions that the offender feels unable to meet. 


Peer Associates 


The other (and sometimes more influential) 
support mechanism that many rely upon 
(non-familial) generally consists of peers or 
associates. The risk factor is that the offender
associates with others in a like situation, and
this reinforces the criminal behavior. Over
time, the offender essentially loses contact
with prosocial or non-criminal individuals. 
In other words, the offender fails to main- 
tain the social support network that supports 
mainstream behaviors (prosocial). This is 
not just an issue of whether the offender is 
involved in a gang but rather whether the 
offender has any close associates that are not 
connected to criminal behavior. The ques- 
tion here is the degree to which the offender 
relies upon the peers that are involved in the 
criminal justice system and whether any of 
the associates are non-criminally involved. 


Criminal Personality 


Using the DSM-IV criteria, antisocial person- 
ality disorder (ASPD) and impulsive behaviors 
are part of the composite of personality disor- 
ders. According to the DSM-IV, approximately 
3 percent of men and 1 percent of women have 
some form of antisocial personality. As shown 
in Table 2 below, the antisocial personality dis- 
order is characterized by a callous unconcern 
for the feelings of others, gross or persistent 
attitude of irresponsibility and disregard for 
social norms, rules, or obligations, incapac- 
ity to maintain enduring relationships, low 
tolerance for frustration and low tolerance
for use of aggression or violence, incapacity
to experience guilt or to profit from experi- 
ence, or marked proneness to blame others for 
the behavior that the offender exhibits. This 
personality disorder differs from psychopathy, 
which is a more callous version of an ASPD, 
and some states have developed legal or judicial 
definitions of what constitutes a psychopath. In 
terms of the medical definition (according to 
the DSM-IV), a psychopath is defined as hav- 
ing no concern for the feelings of others and a 
complete disregard for social obligations. The 
psychopath is generally considered callous and 
incapable of forming lasting relationships; the 
psychopath lacks empathy, remorse, anxiety, 
or guilt and tends to be devoid of conscience. 
Psychopaths are the extreme criminal personality.
A proper diagnosis requires clinical skills
as well as standardized tools (see Hare's
Psychopathy Checklist; Hare, 1990).


TABLE 2: 
DSM-IV-TR Criteria for Antisocial
Personality Disorder (ASPD) 


Criteria for Antisocial Personality Disorder 


Antisocial personality disorder is defined 
as a pervasive pattern of disregard for and 
violation of the rights of others since the 
age of 15 as indicated by three or more of 
the following: 


1. Failure to conform to social norms 
with respect to lawful behaviors 


2. Deceitfulness (repeated lying) or 
use of aliases or conning others for 
personal profit or pleasure 


3. Impulsivity or failure to plan ahead


4. Reckless disregard for safety of self or 
others 


5. Consistent irresponsibility, as indicated 
by failure to sustain work and honor 
financial obligations 


6. Lack of remorse, as indicated by being 
indifferent to or rationalizing having 
hurt, mistreated, or stolen from another 


Source: APA, 2004 


Antisocial Cognitions (Attitudes/ 
Orientation)/Thinking 


Distinct from the personality disorder are the
attitudes and cognitions of offenders. Yochelson
and Samenow (1976) in their seminal work
identified 36 thinking errors that they believed 
are used to shun responsibility, at least as 
defined by society’s standards. The continuum 
of criminality is from responsible to irrespon- 
sible; under “irresponsible” behavior, there is a 
range from nonarrestable to arrestable behav- 
ior. The scholars contend that all individuals 
have these kinds of thinking errors; however, 
criminals exhibit more of them, and they tend
to be on the irresponsible end of the continuum 
in many of the areas. The thinking error phe- 
nomenon gained further steam with the work 
by Walters (1990) and his colleagues, in which 
they developed eight subscales to measure 
criminal thinking; Walters’ work influenced a 
new tool on criminal thinking errors (Knight, 
Garner, Simpson, Morey, & Flynn, 2006). Most 
of this work has been criticized because the 
tools to measure criminal thinking have not 
been validated on a non-offender population, 
and therefore it is unclear whether these char- 
acteristics are concentrated in offenders or dis- 
tributed among the general population. Also 
many of the “thinking errors” are common 
defense mechanisms that are used by human 
beings to handle situations. 

The typical thinking errors include domi- 
nance, entitlement, self-justification, displacing 
blame, optimistic perceptions of realities, and 
“victim stance” (e.g. blaming society because 
they are considered outcasts). As noted by Mark 
Lipsey and Nana Landenberger (2006), such 
“distorted thinking may misperceive benign 
situations as threats (e.g., be predisposed to 
perceive harmless remarks as disrespectful 
or deliberately provocative), demand instant 
gratification, and confuse wants with needs” 
(p. 57). The issue about attitudes and orienta- 
tion is that the focus is on how the offender 
processes and interprets information. 


Low Self Control 


Impulsive and risk-taking behavior is anoth- 
er dynamic characteristic of offenders. The 
general premise is that low self control does 
not define criminal behavior; instead, it provides
a context for criminal acts depending
upon opportunities and other motivating 
factors. A person’s decision to engage in 
criminal acts is affected by other factors such 
as natural constraints, attachments to parents,
school, employment, and so on (Gottfredson
& Hirschi, 1990, pp. 95-97). Low self
control is exhibited by the offender being 
easily persuaded by situational and envi- 
ronmental factors, and without attachments 
there is little to constrain the individual. 


Mental Health, Self-Esteem, 
Low Educational Attainment, 
Employment & Other Factors 


Mental health status, self-esteem, low educa- 
tional attainment, low employment options, 
and other factors are frequently discussed 
in the realm of criminogenic needs. The 
definition of a criminogenic need is that 
the factor predicts criminal behavior, but
the research literature does not demonstrate 
that the presence of these attributes predicts 
recidivism or involvement in criminal behav- 
ior. Rather, low educational attainment and 
unemployment appear to be correlated with 
the offender population, which leads some 
to conclude that addressing these factors 
may also reduce recidivism. As discussed 
previously with substance abuse and
ASPD, the behaviors range on a continuum.
The same is true of mental health disorders,
where the problems range from anxieties or 
depression to erratic and/or risky behavior 
(e.g. hears voices or expresses disorganized, 
disoriented, or paranoid thoughts; appears 
lethargic and sad; unusually manic in behavior,
etc.). However, it is generally recognized
that in order to improve an offender’s well- 
being (which may not be related to recidivism 
reduction efforts), he or she would benefit 
from improved employment, educational, 
and mental health status(es). Ultimately, 
addressing these issues may affect the ability 
of the individual to be a contributing member 
of society and/or family; it is unclear whether 
addressing these factors in and of themselves 
will affect criminal behavior. 


Applying Rule #1 in 
Correctional Agencies 

Exhibit 1 below illustrates the implementation
of these principles into a model. Essentially, 
actuarial risk level should be determined to 
identify what is the offender's likelihood of 
further criminal behavior. High-risk offend- 
ers should be targeted for treatment based 
on the area(s) in which they score moderate
or high on criminogenic needs. That is, the 
offender needs to be assessed also on the 
criminogenic needs to identify the drivers 
to their criminal behavior. The notion is 
that, similar to treatment placement models, 
actuarial risk should drive the priority for 
intensive control and appropriate services, 
with a focus on selecting programs that 
address multiple problem areas. “Appropri- 
ate” refers to attention to the criminogenic 
factors that have been identified. 

The model presented in the exhibit illus- 
trates how the criminogenic factors can exist 
regardless of risk level. That is, a substance 
abuser may be low risk due to the fact that 
he or she does not have a history in the 
criminal justice system. Other criminogenic 
factors may exist in that low-risk person, but 
they are more likely to be low to moderate
in severity. As the offender moves along the 
continuum of risk (moderate to high), then 
it is more likely that more severe problem 
behaviors may occur. This is a byproduct of 
the offender's inability to be a productive, 
contributing member of society. For example, 
a high-risk offender may have criminogenic 
needs relating to self control, peer associates, 
ASPD, and substance abuse. The combined 
treatment and control strategies should be 
designed to address these issues. The model 
also suggests that the high-risk offender is 
more involved in situations, settings, and 
individuals that are likely to further their 
criminal conduct. Hence, control and treat- 
ment services should be concentrated on 
this individual to achieve the desired goal of 
reducing the risk of recidivism. 
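
The triage logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of any validated instrument: the level labels, the `Assessment` record, the function names, and in particular the handling of moderate-risk cases (services only for high-severity needs) are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical ordinal labels; real instruments define their own cutoffs.
LEVELS = ("low", "moderate", "high")


@dataclass
class Assessment:
    risk: str                                   # actuarial risk level
    needs: dict = field(default_factory=dict)   # need area -> severity level


def flagged_needs(assessment, minimum="moderate"):
    """Return need areas at or above `minimum` severity, most severe first."""
    floor = LEVELS.index(minimum)
    flagged = [
        (area, level)
        for area, level in assessment.needs.items()
        if LEVELS.index(level) >= floor
    ]
    # Stable sort: areas with equal severity keep their original order.
    flagged.sort(key=lambda pair: LEVELS.index(pair[1]), reverse=True)
    return [area for area, _ in flagged]


def supervision_plan(assessment):
    """Map risk level and need scores to a control level and service targets."""
    if assessment.risk == "high":
        return {"control": "intensive", "services": flagged_needs(assessment)}
    if assessment.risk == "moderate":
        # Assumption for this sketch: moderate risk gets services only for high needs.
        return {"control": "standard", "services": flagged_needs(assessment, "high")}
    # Low risk: minimal intervention, so that prosocial ties are not disrupted.
    return {"control": "minimal", "services": []}
```

The design choice worth noting is that risk level sets the intensity of control, while the need scores decide which services appear in the case plan; the same need profile therefore yields different plans at different risk levels.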


Conclusions 


The purpose of this article is to further 
elaborate on Rule #1 in Evidence-based prac- 
tices to better illustrate the concepts and to 
define criminogenic needs in the context of 
risk level. This article is driven by the needs 
of the field to translate the principle into 
operational terms. An actuarial-based risk 
screen is important to determine the degree 
to which offenders should be given services 
and resources to ameliorate criminal behav- 
ior. The type of services is determined by 
how the offender “scores” or presents on 
several criminogenic areas. Those offenders 
with high criminogenic needs, particularly 
those that are high or moderate risk, should 
be given services to ameliorate the crimino- 
genic need, which should reduce the risk for 
recidivism. Exhibit 1 conceptually presents 
the framework underscoring EBP #1; the 
challenge to organizations is to implement 
this principle. 

The field faces several challenges relating 
to organizational stamina in implement- 
ing Rule #1 by following this core concept. 
The first challenge is the willingness of the 
organization to focus services on high-risk 
offenders, which generally means that mod- 
erate- or low-risk offenders should not be 
given such services. Minimizing the provision
of services for low-risk offenders essentially
means that probation supervision should
minimize the disruption of prosocial
behaviors, since they are likely the glue that
is preventing the offender from becoming
criminally involved. Another factor is that
the case plan/supervision plan should be
driven by the goal to ameliorate the
criminogenic drivers. This is
critical, since it provides the formula for 
reducing the risk of offenders in the com- 
munity. Exhibit 1 illustrates that when an 
individual is identified as having moderate 
or high criminogenic needs, then the plan 
should be to address these criminal drivers 
in the case plan. That is, the results of the 
assessment are directly relevant to the com- 
ponents of the case plan, because it provides 
an avenue to assist the offender in attending 
to issues that are relevant to his or her life. In 
short, EBP #1 challenges the organizations 
to redo case plans so that they address the 
drivers (criminogenic needs) that are more 
pertinent to the situational factors of the 
offenders. In so doing, case plans become 
the glue for the offender that addresses the 
risk factors. 

COMMUNITY SUPERVISION OFFICERS
as well as academics whose expertise
includes offender classification may be sur- 
prised by the title of this article; after all, 
actuarial risk assessment tools have been 
available to the field of corrections in one 
form or another (e.g., Baird, Heinz, & Bemus, 
1979; Hoffman, 1994; Nuffield, 1982) for 
over 25 years and reliance on actuarial risk 
prediction is now a fundamental precept of 
widely promoted evidence-based practices 
(Bogue, Campbell, Carey, Clawson et al., 
2004). On the other hand, recent research 
on classification practices in community 
supervision indicates that support for actu- 
arial risk assessment is far from universal. 
According to a national survey reported 
in 2001, roughly one-quarter of probation 
and parole agencies and approximately 44 
percent of community corrections treatment 
providers contacted had not yet incorpo- 
rated standardized risk assessment tools into 
supervision practice (Hubbard, Travis, & 
Latessa, 2001). In a later survey conducted 
by the National Institute of Corrections 
(2003) of 74 public community corrections 
agencies exercising jurisdiction over more 
than half of the probationers and parolees 
under supervision in the U.S., respondents 
expressed concerns about the accuracy of 
the instruments and their capacity to really 
measure offender risk. 


Anyone who has worked with or studied 
probation and parole officers recognizes that 
skepticism about actuarial assessment, or 
alternately, belief in the supremacy of profes- 
sional judgment about an offender's likeli- 
hood of new criminal activity, is prevalent 
in community corrections, even in agencies
that have adopted “state of the art” tools.
This article addresses three factors that may 
be interfering with greater investment by 
community supervision officers in actuarial 
risk assessment. The first is overconfidence 
in the influence of perfunctory discussions 
of the clinical versus actuarial debate. Due 
perhaps to the longstanding availability of 
statistical assessment tools in corrections 
practice, training in actuarial risk assess- 
ment emphasizes historical overviews of the 
development of actuarial prediction (i.e., the 
“generations” of assessment techniques), at 
the expense of more persuasive explanations 
of why or under what conditions so-called 
clinical or professional judgment yields 
less accurate predictions than empirically 
derived tools. A second potential impedi- 
ment concerns shortcomings in risk com- 
munication and understanding; a growing 
body of research indicates that decision- 
makers’ acceptance and utilization of assess- 
ment results may depend on precisely how 
offender risk is summarized. The third is 
the tendency to portray actuarial and clini- 
cal assessment methods as a black and white 
debate, thereby failing to recognize and 
affirm the importance of clinical judgment 
and skill to the successful execution of actu- 
arial risk assessment. 


Research on Clinical Versus 
Actuarial Predictions of 
Offender Risk 


Studies that specifically address the effi- 
cacy of clinical judgment relative to actuarial 
assessment can be sorted into two categories. 
The first consists of research that compares 
the prediction accuracy of unstructured clinical
judgments with that of actuarial tools.
The second includes studies that compare 
structured clinical judgments with actu- 
arial tools. Referred to simply as “clinical 
judgments” in most literature on this topic, 
unstructured clinical judgment involves 
the exercise of educated intuition, where 
information items gleaned from interviews, 
client history, psychometric instruments, 
and conferences with other professionals are 
engaged at the discretion of the individual 
carrying out the assessment (Meehl, 1954). 
In contrast, structured clinical judg- 
ment refers to use of scores from formal 
instruments that either were designed and 
validated for purposes other than predic- 
tion of recidivism (and whose individual 
items may or may not exhibit correlations 
with recidivism outcomes), or that were 
developed to predict recidivism but which 
incorporate items selected for their plausi- 
bility (such as may be suggested by relevant 
literature) rather than as a result of statistical 
research during the creation of the instru- 
ment in question. An example of the former 
is the Hare’s Psychopathy Checklist-Revised 
(PCL-R), created solely for the purpose of 
measuring psychopathy, a clinical construct 
(Hare, 2003). Examples of the latter include 
the Historical-Clinical-Risk Management- 
20 (HCR-20; Webster, Douglas, Eaves, & 
Hart, 1997) and the Multifactorial Assess- 
ment of Sex Offender Risk for Recidivism 
(MASORR; Barbaree, Seto, Langton, & 
Peacock, 2001). Douglas and Kropp (2002) 
prefer the label structured professional judg- 
ment over structured clinical judgment, in 
recognition of the numerous nonclinical
professionals (such as probation officers and
victim services personnel) who engage in
risk prediction activities. 

The key difference between actuarial and 
any clinical risk assessment is that the for- 
mer includes only items known to correlate 
with outcome variables, as determined by 
statistical analysis of representative samples 
of cases followed up over fixed periods of 
time, such as three years. Depending upon 
the instrument, these items may also be 
weighted or otherwise subjected to math- 
ematical manipulation and ultimately com- 
bined to optimize prediction accuracy. 
Higher scores on actuarial tools correspond 
to higher probabilities of reoffending. 
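
The scoring step can be illustrated with a short Python sketch. The item names, weights, and cutoffs below are invented for illustration and do not come from any instrument discussed in this article; in an actual actuarial tool they would be derived from statistical analysis of follow-up data.

```python
# Hypothetical items and weights (illustration only, not from a real instrument).
# Each item is coded 1 if present and 0 if absent.
WEIGHTS = {
    "prior_convictions": 3,
    "prior_supervision_failure": 2,
    "under_25_at_release": 1,
}

# Hypothetical cutoffs: minimum total score for each risk band, highest first.
BANDS = [(4, "high"), (2, "moderate"), (0, "low")]


def actuarial_score(case):
    """Weighted sum of the items present in `case` (missing items count as 0)."""
    return sum(
        weight * int(bool(case.get(item, 0)))
        for item, weight in WEIGHTS.items()
    )


def risk_band(score):
    """Return the first band whose cutoff the score meets or exceeds."""
    for cutoff, label in BANDS:
        if score >= cutoff:
            return label
    return "low"
```

Because every item and weight is fixed in advance, two assessors scoring the same case file arrive at the same total, which is the property that separates this approach from discretionary judgment.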


Unstructured clinical judgment versus 
actuarial risk prediction 


Most conclusions about the inferiority of 
clinical predictions relative to actuarial ones 
are based upon research about unstructured 
clinical judgments. In the area of offender 
risk, the majority of these comparisons focus 
on the decision to release mentally disor- 
dered or violent offenders from psychiatric 
institutions, and their subsequent behaviors 
upon entry into the community. Researchers 
then assess which of the two methods, clini- 
cal or actuarial, better predicted both success 
and failure (e.g., reoffending) during a fol- 
low-up period. Examples of research produc- 
ing results favorable to actuarial assessment 
include Quinsey and Maguire (1986) and 
Gardner, Lidz, Mulvey, and Shaw (1996). 

Less common is research targeting more 
general populations of offenders, but again, 
analyses focus on success of institution- 
al release decision-making. Wormith and 
Goldstone (1984), for example, examined the 
relative importance of subjective judgment 
variables (such as employment plans, prog- 
nosis upon release, and police recommenda- 
tions) and static variables (such as criminal 
history, offense type, prior supervision out- 
comes, and demographic characteristics) in 
the statistical prediction of rearrest or revo- 
cation in a sample of 203 male offenders 
paroled from Canadian penitentiaries. The 
researchers found that inclusion of clinical 
variables in the model made only minimal 
improvements in the ability to correctly clas- 
sify success and failures. 

Influential descriptive reviews of com- 
pilations of research studies comparing the 
relative efficacy of actuarial predictions and 
clinical judgments appear in Meehl (1954) 
and Grove and Meehl (1996). The latter work 
is particularly extensive, reporting on 136
published studies in a wide variety of contexts,
including success in employment and
education, adjustment to military life, psy- 
chotherapy outcome, and medical diagnoses 
as well as recidivism. Together, these reviews 
indicate that predictions based on clinical 
judgment only rarely outperform actuarial 
assessments and, more frequently, that the 
latter match or exceed clinical predictions 
in accuracy. Of the 136 studies examined by 
Grove and Meehl, 8 favored clinical judg- 
ment, 64 favored actuarial methods, and 64 
exhibited “approximately equivalent accu- 
racy” (p. 298). 

These 136 studies were later subjected 
to a meta-analysis by Grove et al. (2000). 
The analysis of effect sizes revealed that 
statistical predictions were, on average, about
10 percent more accurate than clinical
predictions, and greatly exceeded the latter in at
least one-third of comparisons. The actuarial
predictions of criminal or delinquent
behavior reported in this study, by the way, 
were always more accurate than clinical pre- 
dictions of the same. 

The Grove et al. (2000) study is
noteworthy not just for the number of stud- 
ies analyzed but because the authors rule out 
competing explanations for the superior per- 
formance of actuarial predictions, includ- 
ing the assessor's field of training, length 
of experience, and task-related experience. 
In addition, they confirmed that actuarial 
predictions were more accurate than clini- 
cal predictions, even when clinicians had 
access to and employed a greater number of 
variables than was available for use in the 
statistical prediction. 

A different approach to comparing clin- 
ical and actuarial assessment appears in 
Mossman (1994), who calculated Area under 
the Curve (AUC) statistics for each of 44 
published studies involving the prediction 
of violence in a total of 16,000 subjects 
consisting of parolees, psychiatric patients, 
and indictees. The AUC represents the probability
that a randomly selected subject who went
on to be violent received a higher predicted
risk than a randomly selected subject who did
not (Hanley & McNeil, 1982). An AUC
value of .50 indicates a prediction that does 
not improve upon chance. The higher the 
AUC value, the more accurate is the predic- 
tion and greater is the improvement upon 
chance. The AUC is now a favored statis- 
tic for summarizing prediction accuracy, 
not just because it takes both false positive 
and false negative errors into account, but 
because it is independent of both base rates
(the frequency of negative outcomes in the
sample in question) and cutoffs used for
delineating high- from low-risk cases (Rice
& Harris, 1995), either of which could otherwise
distort comparisons of prediction accuracy.
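
One way to see what the AUC measures is to compute it directly from its pairwise interpretation (Hanley & McNeil, 1982): the proportion of failure/success pairs in which the case that failed had received the higher risk score, with ties counted as half. The Python sketch below is a minimal illustration under that definition; the function name and the example scores are hypothetical and are not taken from any of the studies cited.

```python
def auc(scores_failed, scores_succeeded):
    """Pairwise AUC: chance that a failure case outscores a success case.

    Ties between a failure score and a success score count as half a win.
    """
    if not scores_failed or not scores_succeeded:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for failed in scores_failed:
        for succeeded in scores_succeeded:
            if failed > succeeded:
                wins += 1.0
            elif failed == succeeded:
                wins += 0.5
    return wins / (len(scores_failed) * len(scores_succeeded))


# Made-up scores: identical distributions give .50 (chance),
# complete separation gives 1.0.
chance_level = auc([1, 2], [1, 2])
perfect = auc([5, 6], [1, 2])
```

No cutoff score appears anywhere in the calculation, and changing the number of cases in either group changes only the number of pairs compared, which is why the statistic does not depend on cutoffs or base rates.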

Unlike the works examined by Meehl 
(1954) and Grove and Meehl (1996), none of 
the studies included by Mossman reported 
direct comparisons of prediction methods, 
but rather outcomes from a single prediction 
method. Techniques represented in Moss- 
man’s compilation included clinical judg- 
ment, use of past behavior as a prediction 
device, and discriminant analysis, both with 
and without cross-validation. By comparing 
the value of AUC across the various studies, 
Mossman demonstrated that clinicians were 
able to differentiate violent from nonviolent 
subjects with “a modest, better-than-chance 
level of accuracy” (p. 790), a finding that was 
unaffected by whether the prediction was 
short- or long-term. Predictions generated 
by discriminant analyses, including those 
involving cross-validation where shrinkage of 
prediction accuracy is the norm, were supe- 
rior to those produced by other methods. 

Studies favoring actuarial methods are 
not without their detractors. Litwack (2001) 
observes that efforts claiming to directly com- 
pare clinical and actuarial assessments really 
do not provide such contrasts; studies that
use offenders released from institutions as
the follow-up population do not
include all offenders recommended for release 
and can include offenders released against the 
clinician’s judgment. Further, the clinicians 
whose judgments were studied may not have 
had access to the array of variables used in 
statistical predictions. Litwack also objects to 
the studies’ lack of cross-validation in both 
clinical and statistical predictions on new 
samples, which would allow for a comparison 
of shrinkage across the two methods. 


Structured clinical judgment versus 
actuarial risk prediction 


Most contemporary research regarding the 
“clinical versus actuarial” question examines 
the relative efficacy of specific assessment 
tools (e.g., Barbaree, Seto, Langton, & Pea- 
cock, 2001; Douglas, Yeomans, & Boer, 2005; 
Grann, Belfrage, & Tengstrom, 2000; Gray et 
al., 2004; Kroner & Mills, 2001). Character- 
istic of these analyses is the comparison of 
the accuracy of predicted outcomes obtained 
from the application of various structured 
clinical and actuarial instruments, using the 
same population of offenders. 

Instruments of the structured clinical
judgment type are like actuarial tools in that 
they are also founded in empirical research. 
They differ from actuarial tools in that items 
were not included specifically for their cor- 
relations with recidivism. Despite this dif- 
ference, however, overall scores may predict 
outcomes of interest with fairly good suc- 
cess. This may be true even of instruments 
such as the PCL-R that were developed for 
purposes other than prediction of recidivism 
(Hemphill & Hare, 2004). 

Several representative studies help to 
illustrate the kinds of outcomes produced 
by comparison of actuarial and structured 
clinical tools. Douglas, Yeomans, and Boer 
(2005) compared the predictive accuracy 
of five assessment tools using a randomly 
selected group of 188 male offenders released 
from federal institutions in Western Canada 
onto community supervision between 1989 
and 1994, nearly all of whom (98.4 percent) 
had at least one conviction for a violent 
offense. Instruments used in the analysis 
included the Violent Offender Risk Assess- 
ment Scale (VORAS; Howells, Watt, Hall, & 
Baldwin, 1997), the Violence Risk Apprais- 
al Guide (VRAG; Quinsey, Harris, Rice & 
Cormier, 2006), the Historical-Clinical-Risk 
Management-20 (HCR-20), the Hare Psychopathy
Checklist-Revised (PCL-R), and
the Hare Psychopathy Checklist: Screening
Version (PCL:SV). Of the five tools studied,
just two—the VRAG and VORAS—merited 
categorization as actuarial instruments. 

The offenders were followed up for an 
average of 7.68 years. Analyses of the pre- 
dictive accuracy of each instrument’s total 
scores with respect to new violence revealed 
that the HCR-20 produced the highest AUC 
value. The value of the AUC for the HCR-20 
total score was .82, compared with .79 for 
the VRAG, .76 for the PCL-R, .73 for the 
PCL:SV, and .61 for the VORAS. Particu- 
lar components of some structured clinical 
instruments also produced high AUC val- 
ues, including factor 2 of the PCL-R, with 
an AUC of .82, and the Risk Management 
scale of the HCR-20, with an AUC of .80. 
These results led the authors to conclude 
that a position of “strict actuarial authority” 
is unfounded. 
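
For readers unfamiliar with the AUC statistic reported in these studies, the following is a minimal sketch of how it can be computed. The AUC equals the probability that a randomly chosen recidivist received a higher instrument score than a randomly chosen non-recidivist, with ties counted as one half. The scores and outcomes below are invented for illustration and are not drawn from Douglas, Yeomans, and Boer (2005); the function name `auc` is likewise hypothetical.

```python
def auc(scores, reoffended):
    """Area under the ROC curve via pairwise comparison.

    scores     -- risk scores, one per offender (higher = riskier)
    reoffended -- matching booleans, True if the offender recidivated
    """
    pos = [s for s, y in zip(scores, reoffended) if y]
    neg = [s for s, y in zip(scores, reoffended) if not y]
    if not pos or not neg:
        raise ValueError("need at least one recidivist and one non-recidivist")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # recidivist scored higher: correct ordering
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Invented data: six offenders, three of whom reoffended.
scores = [12, 25, 18, 30, 9, 18]
reoffended = [False, True, False, True, False, True]
print(f"AUC = {auc(scores, reoffended):.2f}")
```

An AUC of .50 corresponds to chance-level discrimination and 1.0 to perfect separation of recidivists from non-recidivists, which is the scale on which the values reported above should be read.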

Barbaree, Seto, Langton, and Peacock (2001)
reported a comparison of the predictive accu- 
racy of the PCL-R and six instruments com- 
monly employed to assess sex offender risk, 
using 215 sex offenders released from a sex 
offender treatment program in a Canadian
prison between 1989 and 1996 and followed
up for an average of 4.5 years. In addition to 
the PCL-R, other instruments included the 
VRAG and five tools developed specifically 
for the prediction of sex offender risk: the 
MASORR, the Sex Offender Risk Appraisal 
Guide (SORAG; Quinsey, Harris, Rice & 
Cormier, 2006), the Rapid Risk Assessment 
of Sex Offender Recidivism (RRASOR; Han- 
son, 1997), the Static-99 (Hanson & Thorn- 
ton, 1999), and the Minnesota Sex Offender 
Screening Tool, Revised (MnSOST-R; Epper- 
son, Kaul, & Hesselton, 1998). The VRAG, 
SORAG, RRASOR, Static-99, and MnSOST- 
R are actuarial tools; as noted earlier, both 
the PCL-R and MASORR are structured 
clinical judgment instruments. 

The authors used three outcome mea- 
sures: any recidivism, any serious recidivism, 
and any sexual offense recidivism. Analysis 
revealed that while some instruments were 
fairly good at predicting all three outcomes, 
no one instrument was superior in all three. 
For example, the SORAG yielded the highest
AUC values for any recidivism and any seri- 
ous recidivism, at .76 and .73, respectively, 
but the RRASOR produced the highest AUC 
for any sexual offense recidivism, at .77. 
Of no small significance, the RRASOR is 
very easy to use and score. Finally, instru- 
ments falling within the structured clini- 
cal judgment category (i.e., the PCL-R and 
MASORR) yielded less successful predictions
of new sexual offending than those that
were actuarial. 

Gray et al. (2004) examined the predictive 
accuracy of three instruments with varying 
clinical content in a sample of 315 mentally 
disordered offenders following release from 
a medium security institution in the United 
Kingdom between 1992 and 1999. Offenders 
were tracked for at least 3 years. Instruments 
included the PCL-R, the HCR-20, and the 
Offender Group Reconviction Scale (OGRS; 
Copas & Marshall, 1998). Instruments were 
chosen for their differential emphases on 
clinical variables, with the PCL-R relying 
most heavily on clinical judgment, the OGRS 
using no subjective measures, and the HCR- 
20 falling in between. Using various analytic 
techniques, the authors consistently found 
that the OGRS yielded the most accurate 
predictions of reoffending. 

In summary, comparisons of structured 
clinical judgment instruments with actu- 
arial tools sometimes find that the former 
can produce results on par with or even better
than the latter. More frequently, however,
actuarial tools yield the highest AUC values in
prediction outcomes. 

Comparisons like the ones summarized 
above have some limitations. Just because 
one tool happens to yield a higher AUC 
compared to another in the same sample of 
offenders does not mean that the difference 
between the instruments’ AUCs is statisti- 
cally significant. For example, Kroner and 
Mills (2001) conducted a comparison of 
the PCL-R, HCR-20, VRAG, Level of Ser- 
vice Inventory, Revised (LSI-R; Andrews & 
Bonta, 2001), and the Lifestyle Criminal- 
ity Screening Form (LCSF; Walters, 1997). 
While actuarial instruments (VRAG and 
LSI-R) yielded more accurate predictions 
than the others, the differences between the 
AUC values were not statistically significant, 
suggesting a larger than desirable probabil- 
ity that they were due to chance factors in 
sampling alone. Hemphill and Hare (2004) 
point out that comparisons of the PCL-R 
with instruments designed to measure offi- 
cial indices of recidivism overlook its util- 
ity in predicting a wide range of antisocial 
conduct, such as institutional escapes, viola- 
tions of community supervision, and devi- 
ant sexual interest. Thus, failure to include 
such behaviors in an outcome measure can 
suppress the value of the PCL-R in making 
predictions about offender risk. 
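
The point about statistical significance can be made concrete with a small sketch. One common way to ask whether two AUCs obtained on the same offenders really differ is a paired bootstrap: resample offenders with replacement, recompute both AUCs on each resample, and examine the spread of the differences. This is an illustrative technique, not the procedure used by Kroner and Mills (2001), which is not described here; the function names, instruments, and scores are all invented.

```python
import random


def auc(scores, outcomes):
    """Probability that a recidivist outscores a non-recidivist (ties = 0.5)."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_a, scores_b, outcomes, n_boot=2000, seed=0):
    """Percentile 95% interval for AUC(A) - AUC(B) on the same offenders."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # resample lacks one of the two outcome groups; redraw
        a = auc([scores_a[i] for i in idx], ys)
        b = auc([scores_b[i] for i in idx], ys)
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Invented scores for 20 offenders on two hypothetical instruments.
outcomes = [True] * 8 + [False] * 12
instrument_a = [9, 8, 8, 7, 7, 6, 5, 3, 6, 5, 5, 4, 4, 3, 3, 2, 2, 2, 1, 1]
instrument_b = [8, 9, 6, 7, 5, 6, 4, 4, 6, 6, 5, 5, 4, 3, 3, 3, 2, 2, 1, 2]

low, high = bootstrap_auc_difference(instrument_a, instrument_b, outcomes)
print(f"AUC A = {auc(instrument_a, outcomes):.2f}, AUC B = {auc(instrument_b, outcomes):.2f}")
print(f"95% bootstrap interval for the difference: [{low:.2f}, {high:.2f}]")
print("interval spans zero" if low <= 0.0 <= high else "interval excludes zero")
```

When the interval spans zero, the observed gap between the two instruments cannot be distinguished from sampling variation, which is the situation described for the Kroner and Mills comparison.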


Which assessment method is better, 
and why? 


In summary, the case for or against clinical 
judgment is more complex than is typically 
represented to the field of community cor- 
rections. At this time, there is no strong 
empirical case to be made for risk assess- 
ments based on unstructured clinical judg- 
ments. Only rarely are their predictions of 
greater accuracy than actuarial methods; 
much more frequently, they are inferior. 
Though some research demonstrates that 
structured clinical assessments can fare as
well as or better than some actuarial tools in
some populations, there are important con- 
siderations that nonetheless tip the scales in 
favor of actuarial tools. Generally speaking, 
structured clinical tools have been neither
intended for nor validated on general populations
of offenders. Certain instruments
of the structured clinical variety can take 
longer to administer than the typical actu- 
arial tool. For example, the PCL-R involves a 
three-hour interview. (While the PCL-R can 
also be coded from existing files, sufficient 
documentation would not be available in the 
files of most offenders entering community
supervision, even if only felons were con- 
sidered.) Whatever their superiority under 
some conditions and in specific populations, 
there is no compelling reason to favor these 
instruments over actuarial tools that have 
been purposely developed for use in assess- 
ing general populations of offenders under 
community supervision. 

Readers not yet persuaded by this discus- 
sion should note that much corroborating 
research exists outside of the “actuarial ver- 
sus clinical” arena to substantiate the sug- 
gestion that individuals acting on their own 
judgment are notoriously poor at estimating 
risk (see, e.g., Connolly, Arkes, & Hammond,
2000; Kahneman, Slovic, & Tversky,
1982; Slovic, 2000a). Risk perception is an 
inherently error-prone process, affected by 
biases and speculative principles formed on 
the basis of a person’s limited experience. 
For example, decision-makers typically fail 
to take base rates (the frequency with which 
the outcome of interest occurs in the popula- 
tion) into account when making predictions. 
To some extent this may be a function of 
not knowing base rates, yet research finds 
individuals will take base rates into account 
only when there is no other information and 
that they will give preference to irrelevant 
information, such as a stereotype, over base 
rates when both are available (Tversky &
Kahneman, 1974). 
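
A short worked example may help show why neglecting base rates matters. The sketch below applies Bayes' rule to a hypothetical decision-maker whose accuracy is held constant while the base rate of reoffending varies. The function name and all of the numbers are assumptions chosen for illustration; none come from the studies cited.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Share of offenders flagged as high risk who actually reoffend (Bayes' rule)."""
    true_pos = base_rate * sensitivity
    false_pos = (1.0 - base_rate) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical judge who correctly flags 80% of recidivists and
# correctly clears 80% of non-recidivists.
for base_rate in (0.05, 0.20, 0.50):
    ppv = positive_predictive_value(base_rate, 0.80, 0.80)
    print(f"base rate {base_rate:.0%}: {ppv:.0%} of flagged offenders reoffend")
```

With identical accuracy, roughly 17 percent of flagged offenders reoffend when the base rate is 5 percent, 50 percent when it is 20 percent, and 80 percent when it is 50 percent. A prediction that ignores the base rate will therefore overstate risk most severely for rare outcomes.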

Individuals generally overlook cumulative
effects on risks when making predictions
(Slovic, Fischhoff, & Lichtenstein, 1982).
Retrievability and salience factors also dis- 
tort the accuracy of subjective predictions. 
Retrievability refers to risk misrepresenta- 
tion that occurs when easily remembered 
events are mistaken for frequently occurring 
events. Likewise, the salience (impact) of an 
event will cause decision-makers to unjusti- 
fiably heighten estimates of its reoccurrence 
(Tversky & Kahneman, 1974). Females and
males tend to assign different probabilities 
to identical events, even when they have 
training in the same field (Barke, Jenkins- 
Smith, & Slovic, 1997). The interaction of the 
decision-maker’s race and sex also accounts 
for variation in risk perceptions, possibly 
due to the perceived vulnerability of particular
groups relative to others (Satterfield, Mertz, 
& Slovic, 2004; Slovic, 2000b). 

In the absence of empirically based struc- 
tured decision-making aids for predictions 
about offender risk, even the most trained 
and seasoned professionals make predictions
that typically perform no better than chance
(Monahan, 1981). Of course, some unassisted
predictions can be less accurate than
a chance-based classification system. Errors 
are especially likely when corrections practi- 
tioners are asked to make program decisions 
that have the potential to affect public safety. 
The tendency to make predictions that err on 
the side of public safety leads to unnecessar- 
ily restrictive decisions in offender release or 
supervision, otherwise known as overclas- 
sification. For example, Bonta and Motiuk 
(1990) compared rates of recommendations 
for placements in halfway houses across three 
jails, two of which employed the LSI-R as a 
classification instrument and one of which 
relied on a subjective decision-making tool. 
In the jails that used the LSI-R, 51 percent 
of assessed offenders were recommended for 
halfway house placement versus only 16 per- 
cent of offenders classified subjectively. 

In short, actuarial tools are superior 
because they limit discretion that would oth- 
erwise result in erroneous and conservative 
predictions; take advantage of large quanti- 
ties of information, as well as redundant 
and multiple measures, to maximize predic- 
tion accuracy; help to predict recidivism in 
diverse groups of offenders; and, in com- 
parison to structured professional judgment 
tools, take less time to administer. On the 
other hand, the contributions of structured
clinical judgment cannot be discounted with 
respect to certain populations of offenders. 
As Quinsey, Harris, Rice, and Cormier note 
(2006, p. 72), “human judgments applied in 
a very structured way play a large role in the 
actuarial prediction of violence.” 


The Communication of 
Offender Risk 


A growing body of research indicates that 
the manner in which a risk prediction is 
reported can alter the user’s understanding 
and acceptance of assessment results. For 
example, Slovic, Monahan, and MacGregor 
(2000) found that forensic psychologists and 
psychiatrists were more likely to view a 
client’s risk as higher when that risk was 
reported as a frequency (e.g., 10 out of 100 
subjects with the client’s characteristics are 
known to reoffend) than as a probability 
(e.g., the subject has a 10 percent likelihood 
of reoffending), though both portrayals 
clearly represent identical risks. Slovic and 
Monahan (1995) found that the range of 
the response scale provided to clinicians 
affected the magnitude of the risk decision 
they eventually rendered. That is, the clinicians
were far more likely to assign lower
probabilities to a client’s future dangerousness
when provided with a scale that had six
values between 0 and 10 percent, than when 
provided with one that included just two 
values, 0 and 10 percent. 

Two studies in particular help to high- 
light particular features of offender risk 
communication that may help to explain 
why some users remain resistant to actuarial 
instruments, however advanced empirical 
justifications for these instruments may be. 
While such studies focus specifically on 
the communication of risk of violence and 
employ only mental health professionals in 
the role of decision-makers, they have impli- 
cations for communication of general risks 
of recidivism in community corrections. 


Resistance to quantification of 
offender risks 


In the first study, Hilton, Harris, Rawson, 
and Beach (2005) asked the question: What is 
the best way to “package” objective, statisti- 
cal risk information to clinical staff working 
in a forensic mental hospital to encourage 
more widespread use of that information? 
To answer this question, the authors presented
study participants with several variations
of two hypothetical cases, the first, a
lower-risk subject (.24 probability of recidi- 
vism over 10 years) and the second, a higher- 
risk subject (.64 probability over 10 years). 
Both were male patients in maximum secu- 
rity. The descriptions of hypothetical cases 
included either risk-relevant information 
(taken from the VRAG) or risk-irrelevant 
information (e.g., subject’s weight, health, 
and personal preferences). Descriptions also 
included different summaries of subjects’ 
likely risk, which took the form of one of the 
following: a) a probability; b) a frequency; or
c) a statement that no summary of risk was 
yet available. Taking all possible variations 
into account, there were 6 iterations of the 
high-risk case, and 6 of the low-risk case. 
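
The count of six iterations per case follows from crossing the two types of description with the three forms of risk summary. A minimal sketch of that design is shown below; the labels are shorthand for the conditions described in the text, not the authors' own variable names.

```python
from itertools import product

# Labels are shorthand for the conditions described in the text.
cases = {"lower-risk": 0.24, "higher-risk": 0.64}  # 10-year recidivism probability
descriptions = ["risk-relevant", "risk-irrelevant"]
summaries = ["probability", "frequency", "no summary available"]

# Every combination of case, description type, and summary format.
vignettes = list(product(cases, descriptions, summaries))

for case in cases:
    count = sum(1 for v in vignettes if v[0] == case)
    print(f"{case} case (p = {cases[case]:.2f}): {count} variations")
print(f"total vignettes: {len(vignettes)}")
```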

Next, the researchers asked the clini- 
cians to 1) estimate the offender’s likelihood 
of reoffending over the next 10 years, on a 
scale from 1 to 100; 2) rate the offender's 
risk compared to other forensic patients; 
and 3) report which information items most 
affected their assessment. 

Results indicated that when given descrip- 
tive information about risk combined with 
the probability or frequency of reoffending, 
the clinicians tended to inflate their percep- 
tions of the client’s risk. That is, more risk 
information resulted in higher, but more 
inaccurate, estimates of actual risk. When 
clinicians were asked to rate likelihood of
reoffending without having the benefit of 
either probability or frequency information, 
they overestimated the likelihood of reof- 
fending for the low-risk case. Further, their 
estimates of the high-risk case were actually 
more accurate in the absence of risk-relevant 
information than when descriptive informa- 
tion was combined with percentages and fre- 
quency estimates. When asked to name the 
most important information items affecting 
their decisions, participants identified the 
subject’s case history information as the 
most influential. 

These results led Hilton et al. to conclude 
that though probabilities can be as persua- 
sive as frequencies in conveying likelihood 
of reoffending, relevant case history infor- 
mation leads to inflation of offender risk 
estimates. Thus, decision-makers would be 
better off with only quantitative estimates of 
likelihood of re-offending, versus narrative 
information about a case. 

This study indicates that the biases and 
heuristics known to distort perceptions of 
risk in persons acting only on subjective 
judgment may be present even when formal 
estimates of risk have been prescribed. In the 
field of community corrections, the officer 
who performs an assessment is typically the 
one who uses its results. This same indi- 
vidual would have access to and be aware of 
descriptions of prior offending, along with 
a great deal of other information disclosed 
by the offender during the assessment inter- 
view. In addition, the officer has opportunity 
to react to descriptive case details in the form 
of police reports and pre-sentence investiga- 
tions. In combination, these facts may help 
to explain why some officers lack confidence 
in quantitative assessment results. Possibly, 
the content of training in risk assessment 
should be expanded to warn community 
supervision officers about the pitfalls of their 
attraction to descriptive information, and to 
the perils of emphasizing such inputs over 
assessment results. 


Preference for management-oriented 
risk communication 


In the second study, Heilbrun et al. (2004) 
sent a survey to a random sample of 1,000 
psychologists, identified through the Ameri- 
can Psychological Association member- 
ship database. The 256 psychologists who 
responded to the survey reviewed eight 
vignettes in which three variables appeared 
in diverse combinations. Variables included 
risk level (high, medium, or low); risk factors
(static and/or dynamic versions of substance
abuse, medication non-compliance, and vio- 
lence); and risk model (prediction-oriented, 
involving a decision to civilly commit, where 
the court would relinquish jurisdiction over 
the offender upon commitment; or management-oriented,
involving a decision to grant
an inmate parole release, where the parol- 
ing authority would enforce conditions of 
supervision). For each vignette, respondents 
were asked to rate the relative usefulness of 
each of six different ways of communicating 
the subject’s risk of committing a violent act 
toward others. Methods of risk communica- 
tion included a) the probability of violence 
over a forthcoming period of months; b) age 
and status of the subject (such as whether the 
subject had a history of violence or substance 
abuse); c) subject’s level of risk of commit- 
ting a violent act, stated as high, medium 
or low; d) a statement that the subject’s risk 
of violence was dependent upon particular 
risk factors, with information about inter- 
ventions that could control the risk; e) a 
statement that the subject was or was not 
dangerous; and f) likelihood that the subject 
would commit a violent act, stated as a per- 
cent. Of the six alternatives, d) represents a 
management model of risk communication, 
b), a descriptive model, and all others, pre- 
dictive models. 

Analysis of responses indicated that deci- 
sion-makers least preferred “likelihood that 
the subject would commit a violent act” as a 
method for communicating risk, of the six 
alternatives. Second, decision-makers favored 
prediction models of risk communication in 
scenarios involving static factors, but preferred 
management models when dynamic factors 
were presented. This preference was further 
heightened in the presence of high risk. 

In the context of community supervision, 
the study provides an additional explanation 
for officers’ resistance to actuarial predic- 
tions. If they are like the clinicians who 
responded to Heilbrun et al.’s survey, officers 
find predictions inadequate when forecasts 
are not accompanied by identification of 
specific risk factors and appropriate inter- 
ventions. According to Heilbrun et al., these 
enhancements may make decision-makers 
less likely to reject the prediction itself. 

Development and implementation of a 
“fourth” generation of risk assessment tools 
such as discussed by Andrews, Bonta, and 
Wormith (2006), wherein outcomes are tied 
not just to predicted risk but the offender’s 
needs, strengths, and responsivity factors as 
well, would go a long way toward addressing
this potential hindrance to acceptance
of actuarial tools. However, the findings of 
Heilbrun et al.’s study on communication of 
risk assessment suggest that these ties should 
be stated directly. Interestingly, some suggest 
that the deliberate pairing of an individual- 
ized statement of risk with an explanation of 
how risk factors may be modified to reduce 
risk is a means for mediating the “actuarial 
versus clinical polemic” that has permeated 
research on and practice of risk prediction 
(Webster, Hucker, & Bloom, 2002). 


The Importance of Clinical 
Judgment and Skill in Actuarial 
Assessment 


A third potential stumbling block to greater 
acceptance of actuarial risk assessment is 
the choice attached, albeit indirectly, to the 
“clinical versus actuarial” debate. In embrac- 
ing statistically derived assessment tools, 
officers may erroneously come to believe 
they must simultaneously relinquish care- 
fully cultivated professional judgments and 
skills for determining offender risks. Regret- 
tably, the phrase “clinical versus actuarial” 
is the source of a false dilemma, because 
successful execution of actuarial risk assess- 
ments is utterly dependent upon the officer's 
competent exercise of clinical judgments 
and skills. 

To carry out a valid and reliable actuarial 
risk assessment, an officer must possess and 
draw upon effective interviewing techniques. 
These include a constellation of clinical skills 
that, when intertwined with apt professional 
judgments, promote offender disclosure, a 
prerequisite for accurate instrument scoring. 
Relevant skills include adequately reflecting 
back the feeling and meaning in the offender’s
responses, to facilitate rapport but also to 
validate interviewer understanding of the 
subject’s replies. Officers must be skilled in 
the art of the open-ended questioning tech- 
nique, not just to maintain rapport but also 
to avoid limiting or otherwise influencing 
the content of the offender’s response. In 
addition, officers need to be able to suspend 
judgment, affirm at times, and refrain from 
blaming and advising throughout the inter- 
view, for the sake of maximizing disclosure. 
Officers must recognize when they should 
ask for elaboration, such as when they hear 
information that contradicts earlier input, 
when a response is ambiguous, or when they 
simply have not received sufficient information
to permit scoring of an item. To determine
the meaning behind a lack of response,
officers should also be able to recognize
when the offender is confused, resistant, or
merely anxious. 

Some readers may recognize an overlap 
between effective assessment interviewing 
skills and both the general principles and 
opening strategies of Motivational Inter- 
viewing (Miller & Rollnick, 2002). The 
officer’s effective use of empathy, reflec- 
tion, open-ended questions, and affirmation 
throughout the assessment interview begins 
a process of engagement with the offender 
that is gaining recognition as a means for 
increasing the latter’s participation in treat- 
ment and reducing problematic behaviors 
(Moyers, Miller, & Hendrickson, 2005). 

Skill must also be present to bridge mul- 
ticultural divides. To conduct an effective 
assessment, actuarial or otherwise, officers 
must possess an adequate understanding 
of cultural differences as well as sufficient 
sensory acuity for determining whether and 
how those differences may impact client 
disclosure. How best to use eye contact (or 
whether to avoid it), how much time one 
should anticipate awaiting the offender’s 
response, what tone and volume of speech to 
use, how close to sit to the offender, whether 
to expect narrative or direct replies, and 
how to word particular questions to pre- 
vent alienating the offender, are examples of 
knowledge officers can apply to the assess- 
ment context when interviewing individuals 
from cultures other than their own (see, 
e.g., Okun, Fried, & Okun, 1999; Severson & 
Duclos, 2005; Umbreit & Coates, 2000). 

Officers must also draw upon considerable 
judgment and skill to score the assessment. 
Actuarial instruments can comprise
both objective items that are easily scored
and a variety of subjective items, such
as questions regarding impulsivity, relation- 
ships, employment patterns, and attitudes 
that require extended consideration. Gener- 
ally speaking, questions that are not clearly 
objective require the officer to reflect upon 
the offender’s responses to multiple items in 
the assessment interview. Certain actuarial 
instruments even include items widely rec- 
ognized as “clinical.” For example, subject’s 
overall score on the PCL-R, whether the 
subject meets DSM-III criteria for schizo- 
phrenia, and whether the subject meets 
DSM-III criteria for any personality disor- 
der, all appear as items on both the VRAG 
and the SORAG (Quinsey, Harris, Rice, & 
Cormier, 2006). The PCL-R belongs to the 
category of structured clinical instruments, 
and assessment according to DSM criteria is
quintessentially a clinical endeavor. 


After scoring the instrument, officers 
must exercise professional judgment when 
determining whether an override of assess- 
ment results is appropriate. For instance, 
the LSI-R User’s Manual (Andrews & Bonta, 
2001, p. 12) cautions, “it is impossible to 
foresee all possibilities and assess all fac- 
tors that may influence the likelihood of 
criminal behavior. The trained professional 
is encouraged to document features of an 
offender’s situation that may require special 
consideration and that may even override 
the quantitative risk/needs assessment of 
the LSI-R.” 

Above all, officers must use judgment 
and insight following the assessment to 
determine appropriate referrals and fash- 
ion an effective supervision plan for the 
offender. Development of the supervision 
plan requires the officer to look back over 
the whole of the assessment interview to 
identify factors in the offender’s current 
circumstances likely to aggravate the latter’s 
continued involvement in crime, as well as 
forces that will help the offender to avoid it. 


Conclusion 


This essay has addressed three potential stum- 
bling blocks to wider acceptance and utiliza- 
tion of actuarial risk assessment in community 
corrections. While much research finds actu- 
arial tools to be superior to alternatives, due 
either to better accuracy or a combination 
of accuracy and expediency considerations, 
the enterprise of actuarial risk assessment 
requires more comprehensive attention and 
must move beyond mere addition of new 
studies validating particular instruments, if 
a greater embrace by the community correc- 
tions community is to be achieved. This article 
has identified some areas where training about 
risk assessment may be improved. Future 
research should investigate which factors most 
affect officers’ likelihood of accepting and act- 
ing on assessment results, including but not 
limited to scope of training in risk assessment 
and methods for communicating risk assess- 
ment results. 

IN VIRTUALLY ALL decision-making situations
that have been studied, actuarially
developed devices outperform human judg- 
ments. This is true with respect to psychiat- 
ric judgments (see, for example, Meehl, 1965; 
Gough, 1962; Ennis and Litwack, 1974); 
graduate school admissions (e.g., Dawes, 
1979; Dawes and Corrigan, 1974); prognostic 
judgments made by sociologists and psychia- 
trists relative to a parole-violation criterion 
(Glaser, 1955, 1962); parole board decisions 
(Gottfredson, 1961; Gottfredson and Bev- 
erly, 1962; Carroll, Wiener, Coates, Galegher, 
& Alibrio, 1982); mental health and cor- 
rectional case worker judgments of offender 
risk (Holland, Holt, Levi, & Beckett, 1983), 
spousal assault (Hilton and Harris, 2005) 
and in other areas (Goldberg, 1970), includ- 
ing the analysis of credit risk (Somerville 
and Taffler, 1995). Indeed, a recent review 
and meta-analysis of 56 years’ accumulation 
of research on the “clinical vs. statistical” 
prediction “problem” conducted as part of 
a Festschrift for Paul E. Meehl, a pioneer 
in the field, again confirms that statistical 
models outperform clinical decision-makers 
(Aegisdottir, White, Spengler, Maugherman,
Anderson, Cook, Nichols, Lampropoulos, 
Walker, Cohen and Rush, 2006). 

The relative superiority of statistical to 
intuitive methods of prediction is due to 
many factors. For example, human deci- 
sion-makers often do not use information 
reliably (e.g., Ennis and Litwack, 1974), they 
often do not attend to base rates (Meehl 
and Rosen, 1955), and this has been specifically
illustrated in criminal justice decision-making
(Carroll, 1977); they may inappropriately
weight items of information that
are predictive, or they may assign weight
to items that in fact are not predictive; and 
they may be overly influenced by causal 
attributions (e.g., Carroll, 1978) or spurious 
correlations (Monahan, 1981). In fairness, 
it should be pointed out that there may be 
advantages to intuitive judgments as well. 
For example, human decision-makers can 
make use of information that cannot be made 
available to a statistical device (at least read- 
ily). Demeanor during an interview may be 
one such example. Other factors in favor of 
intuitive judgments are reviewed in Dawes 
(1975; Dawes, Faust, and Meehl, 1989).¹

Given these facts, is there reason to still 
consider clinical judgments when determin- 
ing risk-assessment within a justice system 
population? Indeed, with the 1998 publica- 
tion of Violent Offenders: Appraising and 
Managing Risk (Quinsey, Harris, Rice and 
Cormier), we find an argument that we 
should not. “What we are advising is not the 
addition of actuarial methods to existing 
practice, but rather the complete replacement 
of existing practice with actuarial methods” 
(p. 171; see Litwack, 2001 for a strong rebuttal 
in the arena of the assessment of dangerous- 
ness). We argue that even though statistical 
prediction is superior to clinical judgment in 
almost all settings, this does not obviate the 
need for nor value of clinical judgment in a 
variety of arenas, including some criminal 
justice venues. We use the roles of probation 
officers and correctional treatment special- 
ists to provide examples. 


¹ Portions of the preceding discussion adapted
from Gottfredson and Gottfredson (1986). 


Stephen D. Gottfredson 
Laura J. Moriarty 
Virginia Commonwealth University 


Probation Officers and 
Correctional Treatment 
Specialists 


Among the largest groups of criminal justice
professionals is that working in corrections. 
And with the vast number of adults and 
juveniles on probation, parole or incarcer- 
ated, the workload of these individuals is 
quite high. 

According to the U.S. Department of 
Labor, Bureau of Labor Statistics, there are 
about 90,600 probation officers and correctional
treatment specialists nationally
(Bureau of Labor Statistics, 2006), and in 
the federal system, there are approximately 
5,000 officers throughout the United States 
and its territories (personal communica- 
tion, Richard Gayler, May 31, 2006). The 
number of adults on probation in 2004 was 
about 4.1 million (Glaze and Palla, 2005), 
for an average caseload nationally of about 
46. The top three states in terms of employ- 
ment of probation officers and correctional 
treatment specialists are California (13,090), 
Texas (6,100) and Florida (5,760). When 
examining data from 2004 that report each
state’s community corrections population,
we find that Texas has the largest, with 
428,773 adults under supervision, followed 
by California with 384,852, and Florida with 
281,170 (Glaze and Palla, 2005). Using these 
figures, the average caseloads range from a 
high of 70 to a low of 29. 
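
The state caseload figures follow from dividing each state's supervised population by its number of probation officers and correctional treatment specialists. A minimal sketch of that arithmetic, using only the counts quoted above (the dictionary names are illustrative):

```python
# Counts quoted in the text (Bureau of Labor Statistics, 2006; Glaze and Palla, 2005).
officers = {"California": 13_090, "Texas": 6_100, "Florida": 5_760}
supervised = {"Texas": 428_773, "California": 384_852, "Florida": 281_170}

# Average caseload = adults under supervision / officers and specialists.
caseloads = {state: supervised[state] / officers[state] for state in officers}

for state, load in sorted(caseloads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: about {load:.0f} offenders per officer")
```

On these counts Texas has the highest average caseload (about 70), Florida falls in between (about 49), and California has the lowest (about 29).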

Qualifications for employment as a pro- 
bation officer or correctional treatment spe- 
cialist vary by state, but a Bachelor’s degree 
in social work, criminal justice or some other 
related field typically is required (Bureau of 
Labor Statistics, 2006). Some states require
a more advanced degree—Master of Science 
or Master of Arts in a related field (psychol- 
ogy, sociology, criminology, etc.), often with 
an additional experiential requirement as 
well. Many states require that probation 
officers and correctional treatment special- 
ists receive training, upon completion of 
which the candidate must pass a certifica- 
tion test. Typically, new officers work as 
trainees or for a probationary period before 
they become permanent employees. 

Probation officers supervise offenders 
who have been placed on probation while 
correctional treatment specialists counsel 
and create rehabilitation plans for offenders 
to follow when they are no longer incarcer- 
ated or on parole (Bureau of Labor Statistics, 
2006). Probation officers also spend a great 
deal of time in court, investigate offender 
backgrounds, write pre-sentence investiga- 
tion reports, and recommend sentences and 
treatment plans. Correctional treatment spe- 
cialists may work in jails, prisons, or proba- 
tion or parole agencies, where they might 
evaluate the progress of inmates, develop 
parole and release plans, write case reports 
for parole boards and other decision-makers, 
and/or develop and write treatment plans 
and summaries for clients. 

What we have, then, is a large number of
highly qualified and trained professionals 
who routinely are required to make prognos- 
tic decisions about offenders. Elsewhere, we 
have described ways to improve the reliability 
of these decision-making processes (Gott- 
fredson and Moriarty, 2006). We have argued 
that the use of actuarial devices invariably 
increases the reliability and prognostic valid- 
ity of decisions made in these settings (Gott- 
fredson and Moriarty, 2006), and, as noted 
above, some would argue that the surest 
way to do this is to rely solely on statistical 
prediction, such as risk-assessment tools, as 
a way to increase the accuracy and reliability 
of these decisions (Quinsey, Harris, Rice and 
Cormier, 1999). Is it indeed time to supplant
human judgment in justice system settings 
with the cold calculus of the actuary? 


Human Judgment 


Judgments are made routinely in a host of 
fields including psychiatry and psychology 
(Kleinmuntz, Faust, Meehl, & Dawes, 1990; 
Dawes et al., 1989); mental health (Ægisdóttir
et al., 2006); dangerousness (Litwack,
2001); economics (Dawes, 1999); forecasting 
(Bunn & Wright, 1991), medicine, engineering,
finance, management (Kleinmuntz
et al., 1990, p. 146); interpersonal violence 
(Hilton, Harris & Rice, 2006; Mills, 2005); 
and forensics (Hilton, Harris, Rawson & 
Beach, 2005; Harris, Rice & Cormier, 2002), 
in addition to those noted earlier in this 
paper. In most cases, the literature reveals 
strong support for the accuracy of actuarial 
prediction over human judgment. This is a 
longstanding finding, replicated in dozens of 
venues (Dawes et al., 1989; Kleinmuntz et al., 
1990; Westen & Weinberger, 2005). As noted 
earlier, there are many reasons to expect that 
actuarial methods will outperform human 
judgments. In addition to those reasons cited 
above, these methods may be expected to 
provide other benefits: 


Even when actuarial methods merely 
equal the accuracy of clinical methods, 
they may save considerable time and 
expense. ... When actuarial methods are 
not used as the sole basis for decisions, 
they can still serve to screen out can- 
didates or options that would never be 
chosen after more prolonged consider- 
ation. When actuarial methods prove 
more accurate than clinical judgment the 
benefits to individuals and society are 
apparent (Dawes et al., 1989, p. 1673).


Why, then, should we continue to allow 
(indeed, require) probation officers and case 
managers to exercise individual discretion, 
when an actuarially-derived tool may be 
expected to perform better? As Westen and 
Weinberger (2005) remind us in a discus- 
sion of the pioneering work of Paul E. Meehl 
on clinical and statistical prediction, even 
though statistical prediction will routinely 
outperform clinical prediction, we should 
not lose sight of the fact that “actuarial pro- 
cedures are far from infallible, sometimes 
achieving only modest results” (Dawes et 
al., 1989, p. 1673). [For a discussion of the 
methodological and statistical problems 
associated with such applications and the 
resultant fallibilities of such procedures, see 
Gottfredson and Moriarty, 2006]. Still, as 
Dawes (2005) concludes, “whenever statisti- 
cal prediction rules (SPR’s) are available for 
making a relevant prediction, they should be 
used in preference to intuition” (p. 1245). 

But does the superiority of actuarial pro- 
cedures over clinical judgment mean that 
there is no place for clinical judgment in 
predicting behavior? The answer is no: “an 
enormous amount of prediction is made 
by human judgment” (Darlington, 1986, p. 362).
Simply put, clinical methods of decision-making
rest in the decision-maker’s
head, while statistical or actuarial methods 
eliminate the human judgment with the 
“conclusions rest(ing) solely on empirically 
established relations between the data and 
the condition or event of interest” (Dawes et 
al., 1989, p. 1668). There are instances when 
clinicians can make valid inferences (Westen 
and Weinberger, 2005), and there are times 
when it is preferable to use both clinical and 
statistical judgments to predict behavior. 
As Darlington (1986, p. 362) reports, “This
research does not suggest human judgment 
is generally unnecessary; rather it indicates 
that the most accurate predictions gener- 
ally result from a predictive system in which 
human judgment and statistical analysis are 
mixed according to prescribed rules.” More- 
over, Dawes et al. (1989) assert that there 
are instances when clinical judgment might 
improve the actuarial method. They cite 
specifically the following examples: “judg- 
ments mediated by theories and hence dif- 
ficult or impossible to duplicate by statistical 
frequency alone; select reversal of actuarial 
conclusions based on the consideration of 
rare events or utility functions that are not 
incorporated into statistical methods; and 
complex configural relations between pre- 
dictive variables and outcome” (p. 1670). 
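
One way to picture a system in which human judgment and statistical analysis are "mixed according to prescribed rules" is sketched below. It is an illustration only: the item weights, the cut-off, and the list of rare events that permit a reversal are all hypothetical and are not taken from Darlington, Dawes et al., or any published instrument. The actuarial classification stands by default, and a clinician may reverse it only by documenting one of the pre-specified rare events.

```python
from dataclasses import dataclass, field

# Hypothetical item weights; a real instrument derives these empirically.
WEIGHTS = {"prior_convictions": 2, "age_under_25": 1, "unemployed": 1}
HIGH_RISK_CUTOFF = 3  # hypothetical threshold

# Hypothetical, pre-specified rare events that permit a reversal.
ALLOWED_OVERRIDES = {"incapacitating_injury", "explicit_threat_to_victim"}


@dataclass
class Case:
    items: dict  # item name -> 0 or 1
    rare_events: set = field(default_factory=set)


def actuarial_level(case: Case) -> str:
    """Classify risk from the weighted item sum alone."""
    score = sum(w * int(case.items.get(name, 0)) for name, w in WEIGHTS.items())
    return "high" if score >= HIGH_RISK_CUTOFF else "low"


def final_level(case: Case, clinician_level: str) -> tuple:
    """Return (risk level, basis for the decision).

    The clinician's view replaces the actuarial result only when it
    differs AND a documented rare event appears on the allowed list.
    """
    base = actuarial_level(case)
    documented = case.rare_events & ALLOWED_OVERRIDES
    if clinician_level != base and documented:
        return clinician_level, "clinical override"
    return base, "actuarial"
```

Under these assumptions, a clinician who disagrees with the score but cannot point to a listed rare event leaves the actuarial result unchanged, which keeps reversals selective in the sense described above.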


The Place for Personal Judgment 


As some assert, the dilemma once posed 
by “using either the head or the formula is 
no longer the main focus of contemporary 
decision research. Rather, the focus has long 
ago shifted to evaluating the use of both 
modes of information combination in tan- 
dem” (Kleinmuntz et al., 1990, p. 146; and 
see generally, Litwack, 2001). 

In some of the most recent research exam- 
ining violence risk assessment (Hanson, 
2005), we find that the question has shifted 
from whether violence can be predicted, to 
what is the best method of risk assessment. 
The validation research typically found that 
a series of measures reviewed in the article 
showed moderate accuracy in predicting 
violent recidivism. The question then is: 
Might the prediction have been improved if 
clinical judgments were included as well? 

This question is partly answered by 
Douglas, Yeomans, and Boer (2005), who
studied violence risk in a sample of criminal 
offenders. Douglas and colleagues looked 
at the predictive validity of multiple indices
of violence risk. Although they conclude
that “several indices were related to violent
recidivism with large statistical effect
sizes, ...” they also found the findings to
be “inconsistent with a position of strict
actuarial superiority, as HCR-20¹ structured
risk judgments—an index of structured 
professional or clinical judgments—were as 
strongly related to violence” (p. 479).


Conclusion 


With very few established exceptions, statis- 
tical prediction clearly outperforms clinical 
judgment. Accordingly, we certainly would 
not advocate use of clinical judgment over 
statistical prediction. For as Dawes reports, 


The superiority of statistical prediction 
is crystal clear when clinical judgment 
is pitted against actuarial analysis in a 
situation where both are based on the 
same information—so that the problem is 
basically one of how to combine it. It also 
has been found that clinical judgment in 
psychology is inferior in situations where 
the important variables captured by the 
statistical model constitute a proper sub- 
set of the variables considered by the 
clinician. It is also true that the statisti- 
cal models need not even be optimal. 
Nevertheless, clinical psychologists make 
a great deal of money by relying on 
their intuitions for combining informa- 
tion and for making predictions, and in 
courts they eschew statistical models, 
instead proudly proclaiming that “in my 
experience...” What happens here is that 
the “inside view” is preferred to the out- 
side one, despite massive evidence that 
that outside one is superior (Dawes, 1999, 
pp. 37-38). 


However, there are times when a combi- 
nation of the two may better serve clientele. 
As Dawes et al. (1989) report, “Clinicians 
might be able to gain an advantage by rec- 
ognizing rare events that are not included 
in the actuarial formula (due to their infre- 
quency) and that countervail the actuarial 
conclusion” (p. 1670).

And while such incidents might be infre- 
quent, it is also true that the probation offi- 
cers and correctional counseling specialists 
must have a role in decision-making that 
goes beyond the mere administering of the 
risk-assessment devices. There is a place 
for human judgment and experience in the
decision-making process, and we must value
their continued consideration.

¹ Historical-Clinical-Risk-Management-20
(HCR-20)

However, as noted by Sir Francis Bacon,
“We do ill to exalt the powers of the human 
mind, when we should seek out its proper 
helps” (as quoted in Hogarth, 1980). In
light of the well-known tendency for justice 
system decision-makers to concentrate on 
information that is demonstrably not predic- 
tive of offender behavioral outcomes (Gott- 
fredson and Gottfredson, 1986), and the 
potential consequences of this for affecting 
the validity of prognostication (Gottfredson 
and Moriarty, 2006), caution is the order of 
the day. 

IN THIS AGE OF accountability and performance-based
measures, criminal justice
professionals are being increasingly required 
by state and federal agencies to demonstrate 
the reliability and validity of their assess- 
ment instruments including brief symptom 
inventories, diagnostic tools, and violence 
risk assessment measures. Risk assessment 
tools assist institutional classification boards 
as well as parole boards to: 1) determine an 
initial security rating and placement into a 
particular facility and program(s); 2) develop 
a rehabilitation treatment plan; 3) assess 
eligibility for early release; and 4) determine 
the type of supervision needed while on 
parole. This article first describes how the 
juvenile justice system assesses youths’ risks 
and needs through Juvenile Assessment 
Centers, then explores common components 
of assessment in the juvenile justice system, 
and concludes with an examination of the 
most commonly used risk and mental health 
assessment tools and the evidence that sup- 
ports their use. 

All experienced probation officers, juve- 
nile counselors, and forensic clinicians 
should have skills in risk assessment. Clini- 
cal assessment knowledge and skills pro- 
vide the foundation for clinical judgments, 
applied research, and evidence-based prac- 
tice. Within the juvenile justice system, pre- 
diction can be operationally defined as an 
assessment of future lawbreaking for the 
juvenile offenders who are officially pro- 
cessed through the system. 

There are two primary types of prediction:
clinical and actuarial. Clinical prediction
is based on judgments made by
forensic specialists after they have examined
an individual’s criminal and psychosocial 
history, and the results from psychosocial 
scales and inventories. Actuarial prediction 
methods are based on known properties, 
parameters and statistical formulas applied 
to identical sets of data (e.g. demographic 
data, criminal history). Because of the two 
authors’ backgrounds in forensic mental 
health and social work, we focus on clinical 
judgments and the most commonly used 
assessment scales for measuring mental 
health status, psychosocial functioning, and 
future criminality. 

There is no single scale or assessment 
tool that can predict future mental health 
status or criminality with 100 percent cer- 
tainty. Behavior, abilities, peer influences, 
family factors and deviant behavior pat- 
terns are not static. They often change with 
age and different experiences. Empirical 
evidence from classic longitudinal studies 
indicates that violent juveniles are strongly
influenced by male siblings of similar ages, 
delinquent gangs, and small groups of delin- 
quent friends (Farrington & Loeber, 2000; 
Farrington & West, 1990). Therefore, it is 
critically important to use multiple assess- 
ment tools with clients at different points in 
the juvenile justice process. 

Clinical prediction is based on percep- 
tions and judgments in which the juvenile 
justice professional and/or mental health 
clinician uses different data sources, such as 
clinical diagnosis, ratings and scores on psy- 
chosocial risk assessment scales, interviews, 
psychotherapy records, and criminal history 
data to make judgments about the offender's
placement in institutional or community-based
treatment programs, progress, and
discharge from probation. 

The early roots of prediction in juvenile 
justice can be traced to the establishment of 
the first juvenile court mental health clinic 
in Chicago in 1909 (Roberts, 2004), and 
the rapid growth and development of over 
600 child guidance clinics by the late 1950s 
connected to juvenile courts throughout the 
United States (Roberts, 2004). In these clin- 
ics forensic psychiatrists and social workers 
collaborated on behalf of troubled juveniles. 

By the late 1990s, most state juvenile 
correctional agencies included formal and
informal assessments of dangerousness and risk of
further violence and re-offending in their intake
classification and assessment centers. The 
goal of risk assessments is twofold: 1) to pre- 
dict the probability that a juvenile offender 
will re-offend; 2) to predict which youths 
are at high risk of exhibiting violence in the 
institution or residential treatment facility, 
or upon release to parole supervision in the 
community. In general, classification deci- 
sions are made based on forecasts regarding 
which treatment/rehabilitation program is 
likely to be effective in changing the behav- 
ior patterns of specific types of juveniles, 
generally viewed as either property-related 
offenders or violent offenders adjudicated 
for offenses against persons. 

One of the most overlooked areas of juve- 
nile justice is the assessment and treatment 
of juvenile offenders with mental health 
disorders, especially co-morbid psychiatric 
disorders. Research indicates that at least two-thirds
of juvenile detainees have one or more
mental health disorders in addition to their
juvenile offenses. Incarcerated juveniles suf- 
fering from impulsiveness, hopelessness and 
depression are at an increased risk of suicide 
ideation, suicide attempts, and death (Rapp- 
Paglicci & Roberts, 2004). The death rate
from suicide is 4.6 times higher in juvenile 
detention centers than in the general popula- 
tion (Sheras, 2000). Therefore, it is imperative 
that experienced mental health professionals 
be hired by juvenile justice agencies so that 
they can conduct extensive assessments at the 
pre-adjudicatory, incarceration, and community
release stages (Rapp-Paglicci & Roberts,
2004). At the present time, most large juvenile 
probation departments do have a few proba- 
tion officers with expertise in forensic mental 
health assessment and treatment. However, 
the time is now ripe for the National Juvenile 
Detention Association (NJDA), as well as the 
American Correctional Association (ACA), 
and state and county correctional administra- 
tors to follow the lead of the American Proba- 
tion and Parole Association (APPA) in giving 
priority to setting standards, and encouraging 
their members to hire and train staff in all 
aspects of juvenile assessment and treatment. 


Juvenile Assessment Centers 


A promising advancement of juvenile assess- 
ment is the innovative development of cen- 
tralized, single point of entry, intake juvenile 
assessment centers. These assessment cen- 
ters are based on a general model for bring- 
ing together a variety of community agencies 
to one centralized location in which all 
justice system-involved youth can receive 
thorough assessment. Juvenile justice, law 
enforcement, school truancy, diversion pro- 
grams, and other human service agencies are 
centrally located, allowing for efficient and 
comprehensive assessment of youths’ risks 
and service needs (Dembo, Schmeidler, & 
Walters, 2004).

Key elements of juvenile assessment cen- 
ters (JACs) include 1) A single, 24-hour, 
centralized point of contact for all youth in 
contact or at risk of contact with the juvenile 
justice system, 2) Screenings and comprehen- 
sive assessments of youths’ circumstances 
and service needs, 3) Management Informa- 
tion Systems that centralize information 
to avoid repetition and assure appropriate 
treatment, and 4) Case management ser- 
vices that integrate information in order to 
recommend appropriate referrals and follow 
up on youth after they are referred (Dembo, 
Schmeidler, & Walters, 2004). 


Juvenile assessment centers got their start 
in the early 1990s in Florida and quickly 
gained the attention of the Florida legis- 
lature, which was struggling with prison 
overcrowding. With a growing budget due to 
special appropriations, JACs quickly spread 
to several counties across Florida and were 
eventually established in other states, includ- 
ing Colorado and Kansas. Investing further 
in assessment centers with an initiative in 
1996, the Office of Juvenile Justice and Delinquency
Prevention (OJJDP) allocated funds to two
assessment centers, in Denver, CO and Lee 
County, FL, designated as planning sites to 
develop more assessment centers. Additional 
funds supported improving services at two 
designated enhancement sites in Jefferson 
County, CO and Orlando, FL (Dembo, Sch- 
meidler, & Walters, 2004). 

JACs vary considerably by location, due 
in large part to access to resources and the
unique needs of the communities they are 
serving. For example, many Florida JACs 
work closely with nearby juvenile addic- 
tion receiving facilities to provide detoxi- 
fication, assessment, and stabilization for 
youth with substance abuse problems. JACs 
differ in the range of services they pro- 
vide, from those with only juvenile justice 
agencies to those such as the Hillsborough 
County, Florida JAC that provides an array 
of services, including booking, supervision, 
detention center screening, diversion, and 
truancy programming at one site. JACs
located in urban settings tend to have longer 
hours, process many youth, and thus con- 
duct more thorough assessments off-site 
(Dembo, Schmeidler, & Walters, 2004). 

Despite these differences, JACs share com- 
mon benefits to the juvenile justice system. 
They provide a centralized site for legally 
required mandates to be carried out more 
efficiently, saving time locating youth, com- 
pleting multiple screenings, and providing 
information to courts for decision making. 
Integrating information in one information 
system allows for better-informed decisions 
regarding need for services and necessary 
level of supervision. Accessing all system- 
involved youth, JACs create a prime oppor- 
tunity for prevention and early intervention. 
Finally, on a macro-level, information from 
JACs informs the community of broader 
juvenile justice trends and needs for new ser- 
vices (Dembo, Schmeidler, & Walters, 2004). 

Dembo et al. (2004) note an ongoing 
struggle for funding experienced by many 
JACs. Consistent funding at the federal and 
state levels is needed in order to provide
decent salaries to well-trained staff, thereby
reducing staff turnover and improving
quality of service. Additional funds would 
also allow JACs to maintain their origi- 
nal goals of comprehensively responding to 
youths’ multifaceted needs, preventing JACs 
from skimming services and becoming mere 
processing centers. 

With necessary funding and support, the 
future utility of JACs is broad and influen- 
tial. JACs have the potential to play a major 
role in developing empirical knowledge in 
the future. With large sample sizes, JACs’ 
information systems could easily gather data 
on youths’ characteristics, service needs, 
and outcomes in different treatment pro- 
grams, providing juvenile justice research 
with difficult-to-obtain information. This
information can then be used to inform pro- 
gram development and service provisions to 
juvenile offenders. 

JACs can also provide much-needed solu- 
tions to assessment, referral, and service 
delivery in the future. By integrating infor- 
mation among many agencies, JACs can 
help to identify youth who slip through the 
system by failing to follow through on treat- 
ment recommendations. Furthermore, pro- 
viding objective measures of substance use 
through urinalysis screening is an invaluable 
service offered through JACs and has implications
for validating youths’ self-reports of
substance use and subsequent appropriate 
treatment placements. Finally, JACs ensure 
investment in prevention efforts, keeping 
youth from further developing delinquency 
careers; these prevention efforts inversely 
relate to the number of youth requiring 
long-term incarceration that is expensive 
and fairly ineffective (Dembo, Schmeidler, & 
Walters, 2004). 


Components of Risk and Need 
Assessment 


The central goals of youth assessment in 
juvenile justice are: 1) the safety of the com- 
munity by preventing re-offending; and 2) 
youth rehabilitation and clinical treatment. 
In other words, mental health assessments 
seek to identify both risk and treatment 
needs. Assessments must be comprehensive 
and cover several domains. Comprehensive- 
ness includes assessing a youth’s offense 
history, family/environmental factors, edu- 
cation/employment history, peer relation- 
ships, and psychosocial functioning. 
Assessing psychosocial functioning is par- 
ticularly important as the juvenile offender 
population has elevated rates of mental health
and substance use disorders (Teplin, Abram,
McClelland, Dulcan, & Mericle, 2003). Furthermore,
these psychosocial factors (i.e.
personality characteristics, behaviors, affect, 
attitudes, beliefs and interpersonal constructs) 
predict youths’ infractions while incarcerated 
and their behaviors once they are released 
into the community (Cauffman, 2004; Hathaway
& Monachesi, 2003).

Mental health status is often under- 
assessed and consequently under-treated in 
the juvenile justice system. This is because 
of a lack of resources and trained staff, as 
well as a punishment mentality. Teplin et 
al. (2005) report that only 15.4 percent of 
detained adolescents who needed mental 
health treatment received treatment in the 
detention center; it is estimated that as many 
as 13,000 detained youth with major mental 
health disorders go untreated (Teplin et 
al., 2005). Effective mental health assess- 
ment and treatment are critical for achieving 
effective juvenile justice. 

In 2002 the Consensus Conference con- 
vened, composed of more than 20 researchers 
with expertise in mental health assessment 
and juvenile justice, with the aim of devel- 
oping recommendations for mental health 
assessment in the juvenile justice system. 
The Consensus Conference brought together 
nationally recognized experts in the areas 
of mental health, juvenile justice, and child 
welfare service systems. It was guided by 
data from a national survey of current men- 
tal health assessment practices conducted 
by the Center for the Promotion of Men- 
tal Health Assessment in Juvenile Justice. 
Directed by Gail Wasserman at Columbia 
University’s Department of Child Psychia- 
try, the Center’s national survey provided 
information on the current practices and 
needs of juvenile justice systems across the 
nation. From these findings, the Consensus 
Conference was then able to create recom- 
mendations for standardizing mental health 
assessment practices on a national level. The 
Consensus Conference recommended that 
these four types of assessments should be 
conducted: 

1) Emergent risk needs should be assessed 
immediately upon arrival at a secure facility; 
2) A comprehensive mental health assess- 
ment should be conducted on all youths at 
the facility to identify those needing more 
thorough mental health assessments; 3) Prior 
to community re-entry, all youth should be
assessed to facilitate transition and referral
to community mental health services; and 4)
continued re-assessments should take place 
after the youths have returned to the com- 
munity, to prevent re-offenses. 

In the past two decades, several mea- 
sures have been developed to assess juve- 
nile offenders’ mental health and associated 
risks (Grisso, 2005). These measurement 
instruments aim to be accurate, reliable, and 
thorough while being fairly quick and inex- 
pensive to administer. 


Tools for Mental Health and 
Associated Risk Assessment 


There are currently several well-validated 
assessment measures used to predict the 
likelihood of re-offending upon release, 
mental health treatment needs, and danger 
towards self (suicide ideation and suicide 
attempts) and others, based on the pres- 
ence or absence of substance abuse, suicide 
ideation, personality traits, thought distur- 
bance, and depression-anxiety. Below we 
describe several of the most common assess- 
ment tools used in juvenile justice research 
and practice. Several scales are actuarial 
in nature while others integrate actuarial 
assessment with supplemental clinical judg- 
ment. Instruments are categorized accord- 
ing to their utility as brief screening tools, 
comprehensive assessment instruments, or 
risk assessments predicting recidivism or 
dangerousness in the future. Descriptions 
are intended to give a brief overview and 
should not be considered full reviews. For 
more detailed information on each of these 
instruments, readers are directed to Grisso, 
Vincent, and Seagrave’s (2005) Mental Health
Screening and Assessment in Juvenile Justice 
or to literature by each scale’s developer. 


Brief Screening Tools 


Brief screening tools are instruments that 
can be administered very quickly (usually 
in 30 minutes or less) and help staff to iden- 
tify youth who may be of immediate risk 
to self or others. Furthermore, the screen- 
ings should help staff identify youth in 
need of more comprehensive mental health 
assessment. These instruments should be 
easily administered by front-line staff with 
little specialized training, allowing for quick 
and inexpensive use. Brief screening tools 
should not be used to inform treatment 
plans; instead their utility is in identifying 
those youth in need of emergency mental 
health services or those who need more 
comprehensive assessment that can then
inform treatment needs. Table 1 describes
the strengths and limitations of three com- 
monly used brief screening tools. 

MAYSI-2. The Massachusetts Youth 
Screening Instrument—Version 2 (MAYSI- 
2) was developed by Grisso and Barnum 
(2003) as a self-report measure to identify 
youth entering the juvenile justice system 
with thoughts, feelings, or behaviors indicative
of mental health problems. The MAYSI-2 
can be administered by pencil-paper or by 
CD-ROM and consists of 52 yes-no ques- 
tions asking whether each item is true for 
the youth. Seven subscales are assessed, 
including alcohol/drug use, angry-irritable, 
depressed-anxious, somatic complaints, sui- 
cide ideation, thought disturbance, and trau- 
matic experiences. This objective measure 
includes cut-off scores from a normative 
juvenile justice sample that can be used as 
indicators of clinical significance (Grisso & 
Quinlan, 2005). Research evaluating the reli- 
ability of the MAYSI-2 reports internal con- 
sistency ranging from .61 to .86 (Grisso et al., 
2001) and support for test-retest reliability 
on most subscales (Cauffman, 2004). Simi- 
lar positive findings were found in studies 
of validity comparing the MAYSI-2 to other 
standardized scales (Espelage et al., 2003) 
and to the DSM-IV (Wasserman et al., 2004). 
Of interest were several studies that found 
the MAYSI-2 to predict future behaviors 
such as institutional maladjustment, sen- 
tence length, and necessary intervention for 
suicide risk and assaultive behavior (Cauff- 
man, 2004; Stewart & Trupin, 2003). Cauff- 
man and MacIntosh (2006) recently found 
different properties on some subscales, in 
particular the alcohol/drug use, anger-irri- 
tability and suicide ideation subscales, across 
ethnic and gender groups. Further research 
should continue to examine the extent to 
which these subscales are valid measures for 
female and ethnic minority youth. 
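
For readers unfamiliar with how internal consistency is computed, the sketch below shows one common statistic, Cronbach's alpha, for a subscale of yes-no items scored 1 or 0. Treating the coefficients reported above as alpha values is an assumption, since the text does not name the statistic, and the response data are invented purely for illustration.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses: one list per respondent, each holding that respondent's
    item scores (1 for yes, 0 for no). Every list has the same length.
    """
    k = len(responses[0])  # number of items in the subscale
    # Variance of each item across respondents.
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented answers from six respondents to a three-item subscale.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
alpha = cronbach_alpha(sample)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items in a subscale tend to rise and fall together, which is the property the reliability studies cited above were examining.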

POSIT. The Problem-Oriented Screening 
Instrument for Teenagers (POSIT) was devel- 
oped by Rahdert (1991) as a self-report brief 
screening to identify troubled youths’ prob- 
lems in psychosocial functioning requiring 
further assessment. The POSIT, available 
by pencil-paper or by CD-ROM, consists of 
a self-administered questionnaire with 139 
yes-no questions and assesses 10 functional 
areas, including substance use/abuse, physi- 
cal health, mental health, family relations, 
peer relations, educational status, vocational 
status, social skills, leisure/recreation, and 
aggressive behavior/delinquency (Dembo & 
Anderson, 2005). Youths’ total scores in each
problem area can be compared to empiri- 
cally-based cut-off scores allowing for a 
classification of low-, medium-, or high-risk 
for that problem area. While the POSIT is 
objectively scored, collateral information is 
recommended to validate youths’ respons- 
es. Research evaluating the reliability of 
the POSIT indicates internal consistency
exceeding .70 and test-retest reliability
significantly better than chance (Knight, Goodman,
Pulerwitz, & DuRant, 2001). Hall,
Richardson, Spears, & Rembert (1998) found 
high construct validity for the POSIT. Preliminary
research indicates the POSIT is
useful in classifying youth by predicting 
return to the juvenile justice system (Dembo, 
Turner, et al., 1996). 
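
The cut-off logic described above, in which a youth's total for each problem area is compared against cut-off scores to yield a low-, medium-, or high-risk classification, can be sketched as follows. The area names are drawn from the list above, but the cut-off values are hypothetical placeholders and are not the published POSIT cut-offs.

```python
from bisect import bisect_right

# Hypothetical cut-offs per problem area:
# (lowest score counted as medium risk, lowest score counted as high risk).
CUTOFFS = {
    "substance_use": (1, 4),
    "mental_health": (5, 10),
}
LEVELS = ("low", "medium", "high")


def classify(area: str, yes_count: int) -> str:
    """Map a problem-area total (number of 'yes' answers) to a risk level."""
    # bisect_right counts how many cut-offs the score has reached or passed.
    return LEVELS[bisect_right(CUTOFFS[area], yes_count)]


# Example: invented totals for one youth.
totals = {"substance_use": 2, "mental_health": 11}
profile = {area: classify(area, score) for area, score in totals.items()}
print(profile)
```

Because the scoring is mechanical, collateral information still has to be weighed separately, as the text recommends.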

CAFAS. The Child and Adolescent Func- 
tional Assessment Scale (CAFAS) was devel- 
oped by Hodges (2000a) to assess youths’ 
everyday psychosocial functioning across 
school, home, community, and work set- 
tings. Different ratings (parents, teachers, 
youth) of youth’s behaviors are obtained 
across 10 subscales (school/work, home, 
community, behavior toward others, moods/ 
emotions, self-harmful behavior, substance 
use, thinking, material needs, and family/ 
social). Questions are asked for each sub- 
scale that identify severe, moderate, mild, or 
no impairment and also include questions 
that indicate strengths or protective behav- 
iors exhibited by the youth. Raters score the 
CAFAS after collecting information based 
on both their own observations and a family 
of instruments that assess the youth’s and 
their caregiver's perspective on everyday 
functioning. Studies report positive results 
for test-retest reliability, validity, and ability
to predict level of service needs (Hodges & 
Wong, 1997). 

Comprehensive Assessment Instruments 

Before forensic mental health special- 
ists, correctional counselors, and proba- 
tion officers can recommend a treatment 
plan, comprehensive risk assessment data 
must be collected. In comparison to brief 
assessment tools, comprehensive assess- 
ment instruments more thoroughly assess 
several domains of youths’ mental health, 
personality, and psychosocial characteris- 
tics. These assessments often involve longer, 
more intensive interviews and several also 
collect collateral information from other 
settings in the youth’s life (e.g., teachers,
parents, or chart information). Comprehensive
assessment instruments help to clarify
mental health needs, can inform treatment
planning, and are most often conducted 
by professionals or require more involved 
training. Table 2 describes the strengths and 
limitations of three commonly used compre- 
hensive assessment instruments. 

DISC. The Diagnostic Interview Schedule
for Children: Present State Voice Version
(Voice DISC) was developed by Shaffer,
Fisher, Lucas, Dulcan, & Schwab-Stone (2000)
to assess mental health problems; it provides
a diagnosis by evaluating how youth
meet DSM-IV criteria. Now self-administered
on the computer, the assessment
employs a unique pattern of questions based 
on respondents’ answers to previous ques- 
tions, assessing the degree to which they 
meet criteria for more than 30 diagnoses 
(Wasserman, McReynolds, Fisher, & Lucas, 
2005). Subscales include anxiety, mood, dis- 
ruptive behavior, substance use, and miscel- 
laneous (eating disorders, tic disorders, etc.). 
After the assessment tool determines a youth 
meets diagnostic criteria, further questions 
inquire about the severity and frequency 
of these problems in an attempt to under- 
stand impairment. However, youth may be 
limited in their ability to recognize the 
consequences of their own behaviors, and 
it is suggested that clinicians use collateral 
information to determine impairment. DISC 
reports include a list of those diagnoses for 
which the youth met criteria, impairment 
and symptom scores, and a list of “clinically 
significant symptoms.” Acceptable reliability
for most diagnoses and good test-retest
reliability have been reported (Shaffer et al.,
2000). Moderate to poor correlation with
clinician diagnosis has been found (Aronen, 
Noam, & Weinstein, 1993); however, inde- 
pendent clinical diagnosis is known to be 
fairly subjective and unreliable. 

MMPI-A. The Minnesota Multiphasic
Personality Inventory-Adolescent (MMPI-A)
was developed by Archer (1997) and adapted
for adolescents by Butcher et al. (1992) and
is the most widely used personality assessment
(Archer & Baker, 2005). The MMPI-A
consists of 478 items with validity scales
(e.g., defensiveness, tendency to exaggerate,
response consistency), clinical scales (e.g.,
psychopathology such as depression, anx- 
iety, schizophrenia, antisocial behaviors), 
content scales (e.g. externalizing behaviors, 
anger, low self-esteem), and supplementary 
scales (immaturity, repression). Raw scores 
are converted to T-scores and are compared
to normative scores, resulting in classification
of youth who are clinically elevated,
marginally elevated, or typically adolescent. 
Early research found scale 4 (psychopath- 
ic deviate) especially helpful in predict- 
ing delinquency (Hathaway & Monachesi, 
1963). This finding was confirmed in later studies,
which found scale 4 (psychopathic deviate), scale
8 (schizophrenia), and scale 9 (hypomania)
predictive of higher rates of delinquency 
(Archer, Bolinskey, Morton, & Farris, 2003). 
Over 100 studies have examined aspects of 
the MMPI-A, and the instrument is known 
for its good reliability and validity; a good
resource is a review by Forbey (2003). 
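
The conversion from raw scores to T-scores can be illustrated with the standard linear T-score formula, which rescales a raw score so that the normative sample has a mean of 50 and a standard deviation of 10. The sketch below is an assumption-laden illustration rather than the instrument's scoring procedure: the normative mean and standard deviation are invented, the band thresholds of 60 and 65 are assumed defaults, and the published instrument relies on its own norm tables and may apply a different transformation on some scales.

```python
def linear_t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw scale score to a linear T-score (mean 50, SD 10)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def elevation_band(t_score: float, marginal: float = 60.0,
                   clinical: float = 65.0) -> str:
    """Map a T-score to one of the three bands named in the text.

    The default thresholds are assumptions made for this illustration.
    """
    if t_score >= clinical:
        return "clinically elevated"
    if t_score >= marginal:
        return "marginally elevated"
    return "typically adolescent"


if __name__ == "__main__":
    # Hypothetical norms: mean raw score 20, standard deviation 5.
    t = linear_t_score(raw=30, norm_mean=20, norm_sd=5)
    print(t, elevation_band(t))
```

A raw score two standard deviations above the normative mean therefore maps to a T-score of 70, which falls in the clinically elevated band under the assumed thresholds.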

MACI. The Millon Adolescent Clinical
Inventory (MACI) was developed by Millon
(1993) as a short assessment that provides
clinical information on a variety of
psychological problems, including psychopathology,
peer difficulties, family problems,
and confusion about self (Salekin, Leistico,
Schrum, & Mullins, 2005). It also assesses a
balance of externalizing/delinquency risk 
factors as well as suicidal tendency and risk 
towards self. Based on the DSM-IV, the 
MACI includes: 3 validity scales (disclo- 
sure, desirability, debasement), a reliabil- 
ity scale, 7 clinical syndrome scales (eating 
dysfunction, substance abuse, delinquent 
predisposition, impulsive propensity, anx- 
ious feelings, depressive affect, suicidal ten- 
dency), 12 personality scales (introversive, 
inhibited, doleful, submissive, dramatiz- 
ing, egotistic, unruly, forceful, conforming, 
oppositional, self-demeaning, borderline 
tendencies), and 8 expressed concern scales 
(identity confusion, self-devaluation, body 
disapproval, sexual discomfort, peer insecurity,
social insensitivity, family discord, child
abuse). Base-rate scores are calculated and 
are interpreted by mental health profes- 
sionals, who first examine validity and reli- 
ability before identifying problem scales 
with elevated base-rate scores. The MACI is
shown to have good internal and test-retest 
reliability and concurrent and predictive 
validity (Millon, 1993). A recent study by 
Taylor, Skubic-Kemper, Loney, and Kistner 
(2006) extends support for using the MACI 
as a tool for classifying subtypes of serious 
juvenile offenders. Furthermore, the MACI
has been shown useful in assessing clinical 
change from intake to discharge in inpatient 
settings (Piersma, Pantle, Smith, Boes, & 
Kubiak, 1993) and is predictive of recidivism 
(Salekin, Ziegler, Larrea, Anthon, & Bennet,
2003).
Risk for Recidivism and Dangerousness
Assessment Tools 


Research has identified several factors that 
put youth at risk for future violence or 
recidivism. While no definitive list of factors 
has been developed, research has shown that 
there are common pathways to recidivism 
that can be predicted with some accuracy. 
These factors have been used to compose 
assessment tools that measure youths’ risk 
of re-offending once released into the community.
These instruments help juvenile
justice centers make decisions to protect the
community and identify need for further
services such as case management. Risk for
recidivism assessments often involve collecting
collateral information from parents or
chart materials in addition to interviewing,
and are time-intensive. Thus, these assessments
require specialized training or a professional
degree to administer and score.
Assessments of future risk behavior include 
varying degrees of clinical judgment to 
interpret the results and make decisions. 
Table 3 describes the strengths and limita- 
tions of three commonly used assessments 
of future risk. 

YLS/CMI. The Youth Level of Ser- 
vice/Case Management Inventory (YLS/ 
CMI; Hoge, Andrews, & Leschied, 2002) is 
designed to predict juvenile offender recidi- 
vism as well as case management needs, 
making it especially useful in planning for 
transitions out of the juvenile justice sys- 
tem. The YLS/CMI assesses the offender as 
high or low risk, assesses need by targeting 
services due to risk factors, and assesses 
responsivity or reaction to interventions. 
The YLS/CMI is composed of six sections: 
1. Assessment of risk and needs (42-item 
checklist assessing prior/current offenses, 
family circumstances, education/employ- 
ment, peer associations, substance abuse, 
leisure, personality/behavior, and attitude); 
2. Summary of risk/need factors (comparing 
scales to normative ranges); 3. Other needs/ 
special circumstances (situational informa- 
tion such as parental drug use or behav- 
ioral records that add information specific to 
youth); 4. Professional override feature (asks 
clinician to use clinical judgment consider- 
ing all relevant information to rate youth’s 
risk level); 5. Contact level (intensive services 
should be recommended for high risk youth); 
and 6. Case management plan (specific goals 
and objectives for reaching goals). Due to
the complexity and knowledge it requires,
the YLS/CMI is completed by a trained
professional and purposely incorporates a
degree of clinical judgment to supplement 
the objective portions of the assessment. 
Adequate internal consistency (Rowe, 2002) 
and inter-rater reliability have been found in 
empirical studies (Schmidt, Hoge, & Robertson,
2002), except for the leisure/recreation
subscale, which has a wide range of inter-rater
reliability (.05-.92). Several subscales
of the YLS/CMI have been correlated with 
other externalizing measures (Rowe, 2002). 
The ability to predict new charges, new convictions,
and serious offense charges has been
demonstrated consistently for boys and
less consistently for girls (Rowe, 2002;
Schmidt et al., 2002).
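
The interplay between the 42-item checklist and the professional override can be sketched as follows. This is an illustrative sketch under stated assumptions, not the instrument's scoring manual: the score bands, the `RiskRating` structure, and the `rate` function are all hypothetical, and the published normative ranges should be taken from the instrument itself.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical score bands for illustration only: (lowest total, label).
HYPOTHETICAL_BANDS = ((0, "low"), (9, "moderate"), (23, "high"))


@dataclass
class RiskRating:
    total: int             # number of checklist items endorsed
    actuarial_level: str   # level implied by the checklist total alone
    final_level: str       # level after any professional override
    overridden: bool       # True when clinical judgment changed the level


def rate(checked_items: Sequence[bool], override: Optional[str] = None) -> RiskRating:
    """Total a 42-item checklist and apply an optional professional override."""
    if len(checked_items) != 42:
        raise ValueError("expected 42 checklist items")
    total = sum(checked_items)
    level = "low"
    for floor, label in HYPOTHETICAL_BANDS:
        if total >= floor:
            level = label
    final = override if override is not None else level
    return RiskRating(total, level, final, final != level)


if __name__ == "__main__":
    items = [True] * 10 + [False] * 32
    print(rate(items))
    print(rate(items, override="high"))
```

Recording both the checklist-based level and the final level keeps the objective and clinical portions of the assessment distinguishable, which matters when override decisions are later reviewed.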

SAVRY. The Structured Assessment of 
Violence Risk in Youth (SAVRY; Bartel, 
Borum, & Forth, 2000) involves professional 
judgment based on systematic appraisal of 
the degree to which youth demonstrate risk 
factors for future violence. The appraisal 
involves assessment of 6 protective factors 
(prosocial involvement, strong social sup- 
port, strong attachments and bonds, positive 
attitude toward intervention and authority, 
strong commitment to school, resilient per- 
sonality traits). The instrument also assesses 
24 risk factors including: historical (history 
of violence, of nonviolent offending, early 
initiation of violence, history of self harm, 
childhood exposure to maltreatment, paren- 
tal criminality, early caregiver disruption, 
poor school achievement), individual (nega- 
tive attitudes, risk taking/impulsivity, sub- 
stance use difficulties, anger management 
problems, low empathy/remorse, ADHD dif- 
ficulties, poor compliance, low interest/com- 
mitment to school) and social/environmental 
(peer delinquency, peer rejection, stress/poor 
coping, poor parental management, lack of 
personal/social support, community disor- 
ganization) domains. Information should 
be gathered by the examiner through inter- 
views with the youth, review of records, and 
observation (Borum, Bartel, & Forth, 2005).
Numerical ratings are not the goal of this 
assessment; identifying empirically validat- 
ed risk factors specific to each youth is the 
goal. Thus, clinicians are faced with reviewing
the identified risk factors and making
a clinical judgment about a youth’s overall 
risk. Inter-rater reliability is moderate to 
high (.81) (Catchpole & Gretton, 2003) and 
studies show support for concurrent validity 
as compared to the YLS/CMI and PCL:YV 
(Catchpole & Gretton, 2003). Moderate yet
significant correlations were found between
the SAVRY and measures of violence and
aggression (McEachran, 2001; Gretton &
Abramowitz, 2002). Additionally, those
youth characterized as low risk had violent 
recidivism rates (6 percent) much lower than 
those characterized as moderate (14 percent) 
or high risk (40 percent) (Catchpole & Gret- 
ton, 2003). 
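
To make the size of these differences concrete, the reported rates can be expressed as ratios to the low-risk rate. The short sketch below uses only the three percentages quoted above; the function name is our own, and because group sizes are not reported here, the ratios are descriptive and carry no confidence intervals.

```python
# Violent recidivism rates by SAVRY risk rating, as quoted in the text
# (Catchpole & Gretton, 2003).
VIOLENT_RECIDIVISM_RATE = {"low": 0.06, "moderate": 0.14, "high": 0.40}


def rate_ratio(level: str, baseline: str = "low") -> float:
    """Return how many times higher the rate is for `level` than for `baseline`."""
    return VIOLENT_RECIDIVISM_RATE[level] / VIOLENT_RECIDIVISM_RATE[baseline]


if __name__ == "__main__":
    for level in ("moderate", "high"):
        print(f"{level}: {rate_ratio(level):.1f} times the low-risk rate")
```

On these figures, youth rated moderate risk recidivated violently at about 2.3 times the low-risk rate, and youth rated high risk at about 6.7 times that rate.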

PCL:YV. The Hare Psychopathy Check- 
list: Youth Version (PCL:YV; Forth, Kos- 
son, & Hare, 1990) uses multiple sources of 
information across interpersonal, affective 
and behavior domains to identify symptoms 
predictive of serious psychopathy in adoles- 
cents. The examiner uses information from 
an intensive interview with the youth, collat- 
eral sources, and review of the chart to rate 
the youth according to a 20-item checklist
including: impression management, grandiose
sense of worth, stimulation seeking,
pathological lying, manipulation for personal
gain, lack of remorse, shallow affect, lack of
empathy, parasitic orientation, poor anger
control, impersonal sexual behavior, early
behavior problems, lacks goals, impulsivity,
irresponsibility, failure to accept responsibility,
unstable interpersonal relationships,
serious criminal behavior, serious violations
of conditional release, and criminal versatility.
Total scores provide the number of psychopathic
features observed for each youth
but do not result in a cutoff or classification.
However, raters can compare youth’s scores
to percentile scores based on institutional,
probation, and community samples. With
the extensive training required to administer
the PCL:YV, inter-rater reliability scores are
generally high (.90-.96) and internal consistency
is adequate (.85-.94; Forth et al.,
2003). Moderate correlations with reports
of delinquency, externalizing symptoms,
and aggression are reported, while the PCL:YV
(as intended) does not correlate with
measures of internalizing disorders (Campbell
et al., 2004). Recent studies report that the
PCL:YV significantly predicted both violent
and nonviolent recidivism (Corrado, Vincent,
Hart, & Cohen, 2004) as well as clean
urine screens and participation in treatment
(O'Neill et al., 2003a). Eden, Buffington,
Colwell, Johnson, & Johnson (2002) further
support the ability of the PCL:YV to predict
disciplinary infractions in their sample of juvenile
sex offenders. However, Spain, Douglas,
Poythress, & Epstein (2004) found negative 
results, with no relationship evident between 
the PCL:YV and treatment progress. 
Conclusion and Suggestions for
Further Research 


Several brief assessment screens have been
developed, and research supports their ability
to identify youth with emergent risk and to
screen for youth who should receive more
comprehensive mental health assessments.
A study by Wasserman et al. (2004) confirms 
that such brief tools as the MAYSI-2 are use- 
ful for identifying youth who have a possible 
mental health problem so that they can be 
further evaluated by such tools as the DISC 
to identify a more specific diagnosis. These 
brief assessment tools should be utilized 
for these purposes and they are most effec- 
tive when administered promptly upon the 
youth’s arrival at a secure setting. 

Several of the assessments described in 
Table 2 address the aim of tapping different 
domains of functioning in the assessment. 
Some tools utilize multiple sources of data 
such as files and collateral sources of infor- 
mation in addition to self report, while oth- 
ers do not. This reflects the constant struggle 
to provide thorough assessment with lim- 
ited time/expense resources. Considering 
the importance of accurately assessing and 
treating offender mental health problems, 
we conclude that assessments that tap into a 
range of information sources are worth the 
time and effort. 

It is unclear from the current empirical 
evidence how effective the above reviewed 
assessment tools are in re-assessing risk 
once the youth has been reintegrated into the 
community. Further research is needed to 
clarify whether current assessment tools are 
useful for post-incarceration re-assessment 
or whether other assessments, which take 
into consideration the importance of envi- 
ronmental transition, should be developed 
for that purpose. 

It is important to mention that, while 
great progress has been made in beginning 
to understand and assess juvenile offender 
mental health risk, there is much work to be 
done in testing the ability of these assess- 
ment tools to generalize beyond the popula- 
tion for which they were developed. Recent 
research by Wasserman, McReynolds, Ko, 
Katz, and Schwank (2005) examining the 
prevalence of psychiatric disorders among 
youths at probation intake, reported that 
violent female offenders were up to five 
times more likely to report anxiety disorders 
than their male counterparts. Furthermore, 
of youth with conduct disorder, girls seemed 
to be more likely than boys to have complex
diagnoses due to elevated rates of co-occurring
internalizing disorders. Research also
shows that ethnic minority youth are over- 
represented in the juvenile justice system, yet 
few mental health risk assessment tools have 
been tested across gender or ethnic groups 
(Devine, Coolbaugh, & Jenkins, 1998). Fur- 
ther research is needed to evaluate assess- 
ment instruments with female offenders and 
ethnic minority offenders, as research sug- 
gests adaptations may need to be made to 
accurately assess the needs of these vulner- 
able groups (Cauffman et al., 2006). 

One additional facet of risk assessment 
appears particularly lacking in the field of 
juvenile justice. Risk assessments of juve- 
nile offenders need to identify those youths 
likely to re-offend into adulthood, and who 
are likely to be the chronic career criminals. 
Several classic studies have documented the 
pattern of desistance of delinquent behavior 
in young adulthood (Farrington & West, 
1977; Gottfredson & Hirschi, 1990; Elliott
et al., 1983). Gottfredson and Hirschi’s (1990)
key finding was that most juveniles discon- 
tinue their delinquent acts in early adult- 
hood. Farrington and West’s classic study 
indicated that only 22.6 percent of their 
research subjects had subsequent convic- 
tions as adults. Elliott and associates found 
that only 2 to 3 out of every 10 adjudicated 
violent juveniles were arrested for violent 
crimes in adulthood. Also noteworthy is the 
fact that while the majority of delinquent 
youth do not seem to present a long-term risk 
of re-offending in adulthood, there is a small 
group of 5 to 6 percent of different cohorts 
that chronically persist in crime into adult- 
hood and are responsible for a high volume 
of multiple offenses. Wolfgang, Figlio and 
Sellin (1972) reported in their birth cohort 
study that approximately 6 percent of their 
subjects were responsible for over 50 percent 
of the official crimes by the cohort. These
chronic offenders were likely to have poor
school grades and achievement, low IQ test
scores, and low socioeconomic status, to be of
non-white racial background, and to be school
dropouts. Farrington (1985) found similar results
of chronic offending by a small percentage of 
offenders—6 percent committing 49 percent 
of the criminal offenses. 
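
The concentration statistic reported in these cohort studies (a small share of offenders accounting for a large share of offenses) can be computed directly from per-person offense counts. The sketch below is a minimal illustration; the function name is our own and the cohort is synthetic, constructed so that 6 percent of members account for half of all offenses. It is not data from any of the studies cited.

```python
from typing import Sequence


def share_of_offenses(offense_counts: Sequence[int], top_fraction: float) -> float:
    """Share of all offenses committed by the most active `top_fraction` of a cohort."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    counts = sorted(offense_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Number of people in the top group, with at least one person included.
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / total


if __name__ == "__main__":
    # Synthetic cohort of 100: 6 chronic offenders with 10 offenses each,
    # 60 people with a single offense, and 34 with none.
    cohort = [10] * 6 + [1] * 60 + [0] * 34
    print(share_of_offenses(cohort, 0.06))
```

Applied to real cohort data, the same calculation would show whether a jurisdiction's offending is as concentrated as the classic studies suggest, which bears on how much could be gained by identifying chronic offenders early.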

Risk assessment instruments to date have 
not been well tested in their ability to dif- 
ferentiate those youth who will chronically 
offend into adulthood from those who are 
temporary adolescent offenders. Perhaps 
Hare’s Psychopathy Checklist comes closest
to beginning to identify this type of particularly
serious long-term offender. However, it
is clear that, while short-term recidivism is 
important to assess, much work is needed 
to expand current mental health assessment 
tools to better identify the potential long- 
term chronic offenders. 
TABLE 1:
Strengths and Limitations of Commonly Used Brief Assessment Screens

BRIEF ASSESSMENT SCREENS

Instrument (developer): MAYSI-2, Grisso and Barnum (2003)
Subscales: Alcohol/drug use, angry-irritable, depressed-anxious, somatic complaints, suicide ideation, thought disturbance, and traumatic experiences
Classification: Youth self report
Strengths: Low cost; brief administration time; ease of administration
Limitations: Potential social desirability; need thought disturbance for girls; test applicability to ethnic minority youth

Instrument (developer): POSIT, Rahdert (1991)
Subscales: Substance use/abuse, physical health, mental health, family relations, peer relations, educational status, vocational status, social skills, leisure/recreation, and aggressive behavior/delinquency
Classification: Youth self report
Strengths: Identifies youth in need of further assessment; public domain instrument
Limitations: Administer to one youth at a time; test applicability to ethnic minority youth

Instrument (developer): CAFAS, Hodges (2000a)
Subscales: School/work, home, community, behavior toward others, moods/emotions, self-harmful behavior, substance use, thinking, material needs, and family/social
Classification: Parent rating; teacher rating; youth self report; structured observation
Strengths: Easy training for administration; helps prioritize interventions; objective measures of functioning
Limitations: Requires time investment in observing behaviors and collecting collateral information
TABLE 2:
Strengths and Limitations of Commonly Used Comprehensive Assessment Tools

COMPREHENSIVE ASSESSMENT TOOLS

Instrument (developer): DISC, Shaffer, Fisher, Lucas, Dulcan, & Schwab-Stone (2000)
Subscales: Anxiety, mood, disruptive behavior, substance use, and miscellaneous (eating disorders, tic disorders, etc.)
Classification: Youth self-administered, computerized, structured interview
Strengths: Results in diagnosis allowing for more thorough planning; no professional training required in administration; computer administration may ease disclosure of suicidal ideation
Limitations: Computer skills necessary; does not address other social or environmental domains; potential social desirability

Instrument (developer): MMPI-A, Archer (1997)
Subscales: Validity scales (e.g., defensiveness, tendency to exaggerate, response consistency), clinical scales (e.g., psychopathology such as depression, anxiety, schizophrenia, antisocial behaviors), content scales (e.g., externalizing behaviors, anger, low self-esteem), and supplementary scales (immaturity, repression)
Classification: Youth self report
Strengths: Widely used; useful in assessing change over time; ease of administration
Limitations: Requires trained professional to administer; ability to predict violent recidivism has not been evaluated

Instrument (developer): MACI, Millon (1993)
Subscales: Validity scales (disclosure, desirability, debasement), reliability scale, clinical syndrome scales (eating dysfunction, substance abuse, delinquent predisposition, impulsive propensity, anxious feelings, depressive affect, suicidal tendency), personality scales (introversive, inhibited, doleful, submissive, dramatizing, egotistic, unruly, forceful, conforming, oppositional, self-demeaning, borderline tendencies), and expressed concern scales (identity confusion, self-devaluation, body disapproval, sexual discomfort, peer insecurity, social insensitivity, family discord, child abuse)
Classification: Youth self report
Strengths: Minimum training for administrators; built-in measure of validity and reliability; consistent with DSM-IV
Limitations: Relies on client retrospective reports rather than file data; more research needed to assess predictive ability in juvenile justice setting
TABLE 3:
Strengths and Limitations of Commonly Used Assessments of Future Risk

ASSESSMENTS OF FUTURE RISK

Instrument (developer): YLS/CMI, Hoge, Andrews, & Leschied (2002)
Subscales: Assessment of risk and needs (prior/current offenses, family circumstances, education/employment, peer associations, substance abuse, leisure, personality/behavior, and attitude); Summary of risk/need factors (comparing scales to normative ranges); Other needs/special circumstances (situational information such as parental drug use or behavioral records that add information specific to youth); Professional override feature (asks clinician to use clinical judgment considering all relevant information to rate youth’s risk level); Contact level (intensity of services recommended); Case management plan (specific goals and objectives for reaching goals)
Classification: Trained professional completes structured assessment using information from youth interview, file review, and collateral sources
Strengths: Can be administered by “front-line” staff; assesses risks and needs
Limitations: Requires time to review collateral materials; low reliability on one subscale

Instrument (developer): SAVRY, Bartel, Borum, & Forth (2000)
Subscales: Protective factors (prosocial involvement, strong social support, strong attachments and bonds, positive attitude toward intervention and authority, strong commitment to school, resilient personality traits); Risk factors including: historical (history of violence, of nonviolent offending, early initiation of violence, history of self harm, childhood exposure to maltreatment, parental criminality, early caregiver disruption, poor school achievement), individual (negative attitudes, risk taking/impulsivity, substance use difficulties, anger management problems, low empathy/remorse, ADHD difficulties, poor compliance, low interest/commitment to school) and social/environmental (peer delinquency, peer rejection, stress/poor coping, poor parental management, lack of personal/social support, community disorganization)
Classification: Examiner uses information from a systematic assessment of risk and protective factors collected through interview with youth and review of records (police/probation, mental health, social service reports) to make a structured professional judgment
Strengths/Limitations: Does not provide a decision or cut off point, requiring knowledge of how identified factors relate to behaviors; no formalized training provided; requires qualified examiners; predicts case specific violence, not general violence likelihood

Instrument (developer): PCL:YV, Forth, Kosson, & Hare (2003)
Subscales: Impression management, grandiose sense of worth, stimulation seeking, pathological lying, manipulation for personal gain, lack of remorse, shallow affect, lack of empathy, parasitic orientation, poor anger control, impersonal sexual behavior, early behavior problems, lacks goals, impulsivity, irresponsibility, failure to accept responsibility, unstable interpersonal relationships, serious criminal behavior, serious violations of conditional release, and criminal versatility
Classification: Examiner uses information from an intensive interview with the youth, collateral sources, and review of the chart to rate the youth on 20-item checklist
Strengths: Identifies risk factors for potentially very serious offenders
Limitations: Complex training and advanced graduate degree recommended for administering assessment; controversy over stigmatizing youth with psychopathy label


CRIMINAL JUSTICE AGENCIES have 
become de facto settings for mental health 
treatment and other clinical services (Lurigio 
& Swartz, 2000). The growing number of per- 
sons with serious mental illness (SMI) who 
appear at each step in the criminal justice 
process, from arrest to post-incarceration
release, has forced professionals in corrections,
who usually have no backgrounds or
experiences in the mental health field, to 
continually face the challenges of identify- 
ing, referring, and case managing the men- 
tally ill. With or without special training or 
guidelines, most criminal justice profession- 
als (such as police officers, probation officers, 
prison intake workers, and parole agents) 
screen individuals for mental illness in their 
daily practice. They do so in order to make 
decisions about such options as diversion, 
segregated housing, treatment, or other spe- 
cialized interventions. 

Screening for SMI involves a brief initial 
evaluation about a client’s need for mental 
health services, which can be done at the 
point of arrest, sentencing, or imprisonment. 
It also can be done formally or informally, 
and can trigger either an immediate decision 
or a more comprehensive psychiatric evalu- 
ation designed to help make subsequent 
mental health-related decisions about a case. 
For example, police officers at the scene 
determine who should be diverted for an 
emergency hospitalization instead of arrested
and who is at risk of attempting suicide in
lockup. Probation officers conduct mental
health screening as part of an overall needs
assessment to determine offender classifica- 
tion and service brokerage. Detention and 
correctional officers screen incoming prison 
inmates and jail detainees for mental illness 
in order to assign them to specialized hous- 
ing and programming. 

This article examines the use of actuarial 
screening tools that have been developed to 
flag persons with SMI for further assess- 
ment, diagnosis, and treatment in institu- 
tional and community-based correctional 
facilities. In the absence of such tools, the 
mentally ill in the criminal justice system are 
likely to go unrecognized and untreated. The 
paper is divided into four major sections. 
The first discusses the prevalence of per- 
sons with SMI in jail, prison, and probation 
populations and the dearth of mental health 
services for them. The second emphasizes 
the use of valid and reliable screening tools 
as an important first step in providing ser- 
vices for offenders with mental illness. The 
third presents the results of two studies that 
have tested mental health screening tools for 
use with criminal justice populations. The 
fourth concludes with recommendations for 
the incorporation of mental health screening 
tools in the intake protocols of correctional 
departments. 


Mentally Ill in Criminal Justice 
Settings 


Over the past 20 years, several epidemiological
studies have shown that substantial
numbers of persons involved in the
criminal justice system have SMI, such as 
schizophrenia, bipolar disorder, and major 
depression (Abram & Teplin, 1991; Abram 
et al., 2003; Ditton, 1999; Diamond et al., 
2001; Human Rights Watch [HRW], 2003). 
The three largest psychiatric facilities in 
the United States are urban jails: the Los 
Angeles County Jail, the Cook County Jail 
(CCJ) in Chicago, and the jail at Rikers
Island in New York City (Insel, 2003). One 
estimate suggests that 900,000 individuals 
with SMI are admitted to our nation’s jails 
annually (Steadman, Scott, Osher, Agnese, 
& Robbins, 2005). Many factors explain the 
large numbers of the mentally ill in offender 
populations. These factors include transin- 
stitutionalization, stricter civil commitment 
laws, homelessness, public order policing, 
and the fragmentation of the mental health 
and drug treatment service systems (HRW, 
2003; Lamb & Weinberger, 1998; Lurigio, 
2005; Lurigio & Swartz, 2000). 

Ditton (1999) has conducted the only 
national study to date on the prevalence of 
the mentally ill in correctional populations. 
She reported that, at midyear 1998, an estimated
283,800 mentally ill offenders were
incarcerated in our nation’s prisons and 
jails. A total of 16 percent of those surveyed 
in each population reported either a mental 
health condition or an overnight stay in a 
mental hospital. Approximately 16 percent, 
or an estimated 547,800 probationers, indicated
that they had experienced in their lifetime
a mental disorder or stayed overnight in
a mental hospital. 
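
As a rough consistency check on the probation figures above, the reported count and percentage together imply a total probation population. The sketch below is illustrative only: the function name is our own, and the bounds rest on the assumption that "approximately 16 percent" was rounded to the nearest whole percent.

```python
from typing import Tuple


def implied_population(count: float, reported_percent: float) -> Tuple[float, float, float]:
    """Return (low, point, high) estimates of the population behind a count.

    The bounds assume the reported percentage was rounded to the nearest
    whole percent, so the true share lies within half a point of it.
    """
    if reported_percent <= 0.5:
        raise ValueError("reported_percent must exceed 0.5")
    point = count / (reported_percent / 100)
    low = count / ((reported_percent + 0.5) / 100)
    high = count / ((reported_percent - 0.5) / 100)
    return low, point, high


if __name__ == "__main__":
    low, point, high = implied_population(547_800, 16)
    print(f"{low:,.0f} to {high:,.0f} (point estimate {point:,.0f})")
```

The point estimate is roughly 3.4 million probationers, which gives a sense of the scale of the population to which the 16 percent figure applies.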

Based on information from personal inter- 
views, state prison inmates with a mental 
disorder were more likely than other inmates 
to be incarcerated for a violent offense (53 
percent compared with 46 percent) and to 
be under the influence of alcohol or drugs 
at the time of the current offense (59 percent 
compared with 51 percent). They also were 
more than twice as likely as other inmates to 
have been homeless in the 12 months before 
their arrest (20 percent compared with 9 per- 
cent). More than three-quarters of mentally 
ill inmates had been sentenced to prison, jail, 
or probation at least once prior to their cur- 
rent sentence. Since admission, 61 percent of 
mentally ill inmates in state prison and 41 
percent of mentally ill detainees in local jails 
reported that they had received treatment for 
a mental health problem, including coun- 
seling, medication, or other mental health 
services (Ditton, 1999). 

The large number of mentally ill offend- 
ers underscores the need for effective mental 
health screening and treatment services. 
Courts have ruled consistently that jails and 
prisons are legally obligated to provide men- 
tal health and other medical services to 
detainees and prisoners (Diamond et al., 
2001; Veysey & Bichler-Robertson, 2002). 
Left untreated, incarcerated persons with 
SMI have trouble adapting to life in prisons 
and jails, as well as following the written and 
unwritten rules that are inherent in the daily 
routines of correctional facilities (HRW, 
2003; Torrey, 1995). Furthermore, prisoners 
and detainees with SMI are at high risk for 
suicide, disciplinary infractions, and victim- 
ization (Dicataldo et al., 1995; HRW, 2003). 
Similarly, mentally ill probationers often 
have trouble complying with their probation 
orders, such as reporting to their probation 
officers or finding employment and housing 
opportunities. These difficulties increase 
their risk for a technical violation or new 
arrest (Solomon & Draine, 1999). 

The landmark case of Ruiz v. Estelle 
(503 F. Supp. 1265, 1323 [1980]) set forth
standards for “minimally adequate men- 
tal health treatment programs” in prisons. 
These standards consist of the systematic 
mental health screening and evaluation of 
inmates; the capacity to ensure that treat- 
ment involves more than just inmate seg- 
regation; the provision of individualized 
treatment by trained mental health professionals;
the maintenance of accurate and
complete mental health records; the supervision
and review of prescriptions; and the
identification of inmates with suicidal ten- 
dencies (Jemelka et al., 1993). Other case 
precedents have established jail detainees’ 
rights to mental health treatment and after- 
care services (e.g., Brad H. et al. v. City of 
New York et al. 185 Misc. 2d 420; 712 N.Y.S. 
2d 336 (Sup. Ct. 2000)). 

Notwithstanding the clear legal man- 
date to provide mental health services, and 
the prodigious numbers of offenders who 
need and receive such services, many men- 
tally ill individuals remain unidentified 
and untreated while under the jurisdic- 
tion of the criminal justice system (Elliot, 
1997; National Commission on Correctional
Health Care [NCCHC], 2002; Steadman &
Veysey, 1997; Teplin, 1990). For example, in
a study of female jail detainees, Teplin et al. 
(1997) found that only 25 percent of those 
meeting the criteria for SMI received treat- 
ment within one week of admission. As Dit- 
ton (1999) found, nearly 40 percent of prison 
inmates and 60 percent of jail detainees 
with mental illness reported that they were 
receiving no mental health services during 
their recent incarceration. 

These findings mirror those of broader 
studies of mental health care for prisoners 
(ACP et al., 1992; Jordan et al., 1992) and 
probationers (Lurigio et al., 2003), which 
also have found high rates of untreated 
mental illness. For example, a national survey
of probation departments by Boone (1995) found
that only 15 percent of responding
departments had programs for the
mentally ill. Several factors contribute to the 
criminal justice system’s failure to deliver 
adequate treatment to individuals with SMI. 
These include the scarcity of mental health 
resources, the rapid turnover of detainees 
in jails, correctional staff with little clinical 
training, and significant increases in the 
criminal justice population, which exceeds 
the capacity of criminal justice organiza- 
tions to implement mental health servic- 
es for all persons who require such care 
(NCCHC, 2002). 


Need for Mental Health 
Screening Tools 


Clinicians use several techniques at intake 
to collect information for formal psychiatric 
evaluations. In general, they examine the 
type and severity of the symptoms (what the 
patient reports) and signs (what the clinician 
observes) of SMI. The two key components of
the evaluation process are the construction
of treatment histories, including the use of 
medications, and the performance of mental 
status examinations to evaluate current levels 
of cognitive, social, behavioral, and emotion- 
al functioning. Information gathering tech- 
niques include structured and unstructured 
interviews, symptom questionnaires, patient 
observations, personality inventories, and 
psychological tests. The clinician integrates 
the findings from these various sources in 
order to render a diagnosis and recommend a 
psychiatric treatment and rehabilitation plan 
(Nolen-Hoeksema, 2005). 

The basic psychometric properties of psy- 
chiatric assessment tools are validity and 
reliability. The former refers to the accuracy 
of instrumentation and the latter to the con- 
sistency with which such instrumentation 
is used to collect information. The field 
of psychometrics has established different 
types of validity and reliability for different 
purposes of instrument construction and 
application (Anastasi & Urbina, 1997). Two 
other concepts useful in determining the 
accuracy of an assessment or screening tool 
are sensitivity and specificity. Sensitivity is 
the likelihood that a tool will find disease
among those who have the disease, or the 
proportion of people with disease who have 
a positive test result. Specificity is the likeli- 
hood that a tool will find no disease among 
those who do not have the disease, or the 
proportion of people free of a disease who 
have a negative test result. 
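
As an illustration only, the two definitions can be written out in a few lines of Python. The counts below are hypothetical and are not drawn from any study cited in this article; the function names are our own.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of people with the disorder who screen positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of people without the disorder who screen negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening results for 1,000 people (illustrative only):
# 150 have the disorder, 850 do not.
tp, fn = 120, 30   # disordered: flagged vs. missed (false negatives)
tn, fp = 765, 85   # non-disordered: cleared vs. wrongly flagged (false positives)

print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # 120 / 150
print(f"specificity = {specificity(tn, fp):.2f}")  # 765 / 850
```

In this made-up example, the 30 missed cases are the false negatives and the 85 wrongly flagged cases are the false positives.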

As we suggested in our preceding discus- 
sion, psychiatric assessment requires con- 
siderable time, skill, and training in the use 
of specialized diagnostic tools and criteria. 
Few correctional staff persons have such 
backgrounds or knowledge. Clinicians often 
are hired in correctional settings to perform 
assessments and treatment, but their time 
is usually limited and their caseloads are 
overwhelmingly high (Earley, 2006). Hence, 
it is infeasible and costly to conduct a full- 
blown psychiatric assessment on every per- 
son entering the criminal justice system. 

To conserve scarce clinical time and
resources, mental health screening should 
be employed; its goal is the identification of 
persons who need further evaluation. How- 
ever, if mental health screening is done at all 
in criminal justice settings, screening tools 
are usually unstandardized, and screening 
procedures vary greatly within and among 
agencies. The validity and reliability of such
screening results are quite low (Steadman &
Veysey, 1997). For example, a survey of state
prisons found most of the responding insti- 
tutions had no valid data on the prevalence 
of various psychiatric conditions (primar- 
ily SMI) in their inmate populations—a 
problem attributable to antiquated informa- 
tion systems as well as to non-standardized 
screening and assessment procedures (Hor- 
nung et al., 2002). 

Although no comparable surveys have 
been conducted of probation departments 
or jails, a similar degree of variation in 
screening and referral practices is likely to 
be found in those settings as well (Lurigio et 
al., 2003; Skeem et al., 2003). As Steadman
et al. (2005) noted about screening for men- 
tal health problems in jails: “Screening may 
consist of anything from one or two ques- 
tions about previous treatment to a detailed, 
structured mental health status examina- 
tion” (p. 816). 

Overall, few criminal justice agencies 
employ brief screening tools that are con- 
structed and validated specifically for offender 
populations (Swartz, 2001). Since the inception 
of the war on drugs, criminal justice adminis- 
trators and practitioners have encouraged the 
use of screening and assessment tools to detect 
substance use disorders. However, they have 
paid little or no attention to the development 
of comparable instruments for detecting men- 
tal illness (Peters et al., 2000). 

Many criminal justice agencies construct 
their own psychiatric screening tools, which 
rarely are subjected to rigorous reliabil- 
ity and validity studies and often are based 
simply on face validity, which is the lowest 
level of measurement accuracy. Face valid- 
ity requires only that a tool appears to be 
measuring what it purports to measure. Still 
others rely on probation officers’ subjective 
judgments about mental health needs, which 
tend to grossly underestimate the number of 
probationers with SMI (Lurigio & Swartz, 
2006). The best screening tools have high 
predictive validity: “most of the people who 
are flagged by [the tool] as being positive 
should, on assessment, be found to have a 
treatable serious mental illness” (Steadman
et al., 2005, p. 817).

The screening instrument presently used 
in the CCJ is typical. More than 300 detain- 
ees are processed through reception and 
classification every day at the jail. They are 
screened for psychiatric problems by mental 
health specialists who ask a short series of 
questions about previous psychiatric hospitalizations,
current use of psychiatric medication,
and recent thoughts of suicide. This
set of questions is consistent with recom- 
mended screening practices for jails and 
prisons and is effective in identifying many 
individuals who require further assessment 
and treatment (American Psychiatric Asso- 
ciation [APA], 1989; Steadman & Veysey, 
1997). Nonetheless, the questions are likely 
to miss numerous other individuals with 
SMI, specifically, those who have no previous 
hospitalizations or current suicidal ideation, 
or who are taking no psychiatric medica- 
tions (Teplin, 1990; Teplin et al., 1997; Teplin 
& Swartz, 1989). Teplin (1990) reported that 
these criteria missed nearly two-thirds of the 
mentally ill detainees who were identified 
with an independently administered and 
standardized clinical instrument. 

As we noted above, screening instru- 
ments for psychiatric disorders have been 
validated on samples from the general popu- 
lation but not on samples from criminal 
justice populations. The accurate estimate 
of the underlying prevalence of a disorder is 
important for clinical decision-making pur- 
poses; for example, a determination of when a 
person’s symptoms suggest that further evalu- 
ation and treatment are warranted (Schmitz 
et al., 2000). Thus, a screening tool for cor- 
rectional settings should be tested on criminal 
justice populations, where the prevalence of 
psychiatric disorders is higher than it is in the 
general population (HRW, 2003). The choice
of a corrections-based screening tool also
must take into account the conditions under
which screening occurs in criminal justice
settings. Those conditions constrain the kinds
of instruments that are suitable: clinical
services and treatment slots are in short
supply, the volume of screenings is high, and
screenings are conducted by correctional staff
persons who have no clinical or diagnostic
expertise.
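
To see why the underlying prevalence matters, the following sketch applies Bayes' rule to a single hypothetical tool in two populations. The sensitivity, specificity, and prevalence figures are assumptions chosen for illustration; they are not estimates from any cited study.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Share of positive screens that are true cases (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# One hypothetical tool (80% sensitive, 90% specific) applied to two
# populations that differ only in the assumed prevalence of the disorder.
for label, prev in [("lower-prevalence population", 0.05),
                    ("higher-prevalence population", 0.15)]:
    ppv = positive_predictive_value(prev, 0.80, 0.90)
    print(f"{label}: prevalence={prev:.0%}, PPV={ppv:.2f}")
```

With identical sensitivity and specificity, the share of positive screens that are true cases rises with prevalence, which is one reason a tool calibrated on the general population may behave differently in a correctional setting.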

Hepburn (1994) recommended that screen- 
ing instruments for substance use disorders in 
criminal justice settings have standardized 
and replicable scoring criteria that can be 
implemented and interpreted by lay inter- 
viewers. They also should be brief and easy 
to administer without the need for extensive 
training. Screening for psychiatric illnesses 
in a criminal justice setting is subject to the 
same contextual limitations and pressures as 
screening for substance use disorders. There- 
fore, the ideal psychiatric screening device 
would have the same properties as a substance 
abuse screening device. 


Defining the appropriate content of a 
screening instrument for psychiatric dis- 
orders is much more difficult than it is for 
substance use disorders, for two basic rea- 
sons. First, the Diagnostic and Statistical 
Manual of Mental Disorders (DSM-IV-TR), 
Axis I (Clinical Disorders) has 15 general 
categories of non-substance use disorders 
(APA, 2002). It is possible to screen briefly 
for all drugs of abuse or simply for the pres- 
ence of any substance use problem. How- 
ever, it is impossible (at the very least, highly 
unwieldy) to efficiently screen for every Axis 
I psychiatric diagnosis. Second, even if it were
possible to screen for every DSM-IV, Axis I 
disorder, not everyone with a disorder needs 
treatment. Clinical severity and treatment 
need lie on a continuum; even some persons 
with seemingly severe psychiatric disorders 
are able to function adequately without clin- 
ical intervention (Regier et al., 1998). The 
challenge is to determine which psychiatric 
disorders should be included in a screening 
tool and to define when a disorder is severe 
enough to warrant clinical intervention or 
further assessment. 


Candidate Screening Instruments 


Two approaches have been adopted to
simplify standardized screening procedures 
for psychiatric disorders. The first determines 
if an individual meets the diagnostic crite- 
ria for a few diagnoses that are likely to be 
clinically severe and to require treatment and 
other interventions. This restricted subset 
of all DSM-IV diagnoses typically includes 
those that are widely regarded as SMIs: 
schizophrenia, bipolar disorder, and major 
depressive disorder. Similar to the longer 
assessment instruments for psychiatric diag- 
nosis, such as the Composite International 
Diagnostic Interview (CIDI) (Robins et al., 
1988) from which they were derived, these 
instruments are modular. Each module con- 
sists of a sequence of questions for diagnosing 
a specific disorder or class of disorders. 

Among the tools that adopt a diagnostic 
approach to screening are the Composite 
International Diagnostic Interview - Short 
Form (CIDI-SF) (Kessler et al., 1998), the 
Mini-International Neuropsychiatric Interview (MINI)
(Sheehan et al., 1998), and the Referral Deci- 
sion Scale (RDS) (Teplin & Swartz, 1989). The 
administration time for these relatively brief 
instruments can be shortened further by omit- 
ting modules that screen for disorders that are 
of little interest to clinicians or researchers.

Despite their administrative flexibility,
problems of over- and under-identification 
with this class of instruments limit their use- 
fulness, particularly in resource-constrained 
criminal justice settings in which accuracy is 
crucial. The diagnostic approach to screen- 
ing equates the need for clinical intervention 
with diagnosis. Persons who meet the diag- 
nostic criteria for one or more disorders are 
referred for further assessment and possibly 
treatment. Those who meet none of the full 
criteria for any disorders are not referred for 
either further assessment or treatment. 

The potential drawback of such tech- 
niques is that they miss persons who have a 
severe disorder that is not contained in the 
screening tool (such as post-traumatic stress 
disorder or generalized anxiety disorder) but
that nevertheless requires clinical interven- 
tion. These errors are called false negatives: 
concluding that a person with mental ill- 
ness has no disorder. Such individuals might 
receive no treatment, while detained or under 
supervision, despite the clinical severity of 
their conditions. They are at risk for prob- 
lems associated with untreated SMI (e.g., 
rearrest, homelessness, violence, substance 
abuse). Moreover, despite the administrative 
flexibility of being able to select the diagnoses 
that are included in the screening, the neces- 
sity of obtaining a DSM-IV diagnosis adds a 
level of complexity to the instruments that is 
attributable to the use of skip patterns and 
question probes. Given the lack of clinical 
and interviewing skills among corrections 
staff, the inclusion of even a small number of 
skip patterns and probes can sharply reduce 
the validity and reliability of the instrument. 

The problems related to using a diag- 
nostic approach to screening for psychiatric 
treatment have recently led to a second 
approach that de-emphasizes diagnosis and 
focuses instead on symptom severity and 
level of impairment (Kessler et al., 2002). 
Although this approach has been discussed 
widely in the literature (see Murphy, 2002), it 
has recently gained currency because large- 
scale epidemiological surveys, such as the 
Epidemiologic Catchment Area (ECA)
study and the National Comorbidity Survey 
(NCS), have found surprisingly high preva- 
lence rates of psychiatric disorders (Kessler 
et al., 1994; Regier et al., 1990). 

In both studies, between 20 and 30 
percent of the general population met the 
DSM-IV criteria for at least one past-year 
Axis I disorder. As it is unlikely that this 
large a proportion of the general population
required mental health treatment services,
the findings of these surveys were of limited 
usefulness in guiding federal and state treat- 
ment resource allocations. These findings 
also suggested that screening for symptom 
severity and level of functional impairment 
is a more efficient way of discerning the need 
for psychiatric treatment (see Regier et al., 
1998; Slade & Andrews, 2002). 

The tools related to the symptom-severity 
approach are particularly applicable for use 
in criminal justice settings. Their advantag- 
es include the use of briefer screening instru- 
ments, without skip patterns and probes, 
which makes the screening tool simpler 
to administer. Such screening tools can be
implemented by lay interviewers to identify 
individuals with the most severe psychiat- 
ric disorders, regardless of diagnosis. This 
approach conserves limited resources for 
only those mentally ill persons most in 
need of services. In other words, such tools 
can avoid false positives, which identify as 
mentally ill persons whose symptoms are 
not severe enough to warrant treatment. A 
low false positive rate is especially important 
in criminal justice settings, in which scarce 
mental health resources must be used spar- 
ingly (Steadman et al., 2005). 


SMI Screening Tools for 
Criminal Justice 


K6/K10. Among the class of instruments that 
take this approach to screening are the K6/ 
K10 scales that, for the reasons we mentioned 
above, appear to be especially promising for 
use with criminal justice populations (Kes- 
sler et al., 2002). Kessler et al. (2002) began 
the tool-building process with a large pool 
of items derived from an extensive battery of 
psychological instruments. They then used 
analytic procedures, based on item-response 
theory, to distill a subset of 10 questions (the 
K10) and a completely overlapping subset of 
6 questions (the K6). 

The K6 asks respondents how often in 
the previous month they felt “nervous,” “so 
sad that nothing could cheer [them] up,” 
“restless or fidgety,” “hopeless,” “everything 
was an effort,” and “worthless.” These ques- 
tions identified with maximum sensitivity 
the individuals who met the following two 
criteria: a past-year diagnosis of any DSM- 
IV Axis I psychiatric disorder and a Global 
Assessment of Functioning (GAF) score 
below 60 (i.e., moderate to severe impair- 
ment in functioning) (see APA, 2002, and 
Endicott et al., 1976).

Further calibration of the K6/K10 scales
was done to select cutoff scores that identify
individuals above the 90th percentile in
symptom severity, consistent with estimates 
which suggest that 6-10 percent of the gen- 
eral population needs psychiatric treatment 
services at any given time (Kessler et al., 
2002). In validity studies, the K6 scale has 
performed as well as the K10 in identifying 
individuals with SMI and has become the 
more widely used instrument (Kessler et al., 
2003). The K6 is now included in national 
surveys, such as the National Survey on 
Drug Use and Health (NSDUH) and the 
National Health Interview Survey (NHIS). 
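
A minimal sketch of how K6 responses might be scored follows. It assumes the commonly used response coding of 0 ("none of the time") through 4 ("all of the time"), which is not spelled out in the text above, and it applies the cutoff of 13 or higher that is reported later in this article for the arrestee sample. The item keys and function names are hypothetical labels of our own.

```python
K6_ITEMS = ("nervous", "hopeless", "restless_or_fidgety",
            "so_sad_nothing_could_cheer_up", "everything_an_effort",
            "worthless")

# Assumed response coding: 0 = none of the time ... 4 = all of the time.
MIN_RESPONSE, MAX_RESPONSE = 0, 4
SMI_CUTOFF = 13  # a total of 13 or higher is treated as a positive screen


def k6_total(responses: dict) -> int:
    """Sum the six item responses after checking completeness and range."""
    missing = [item for item in K6_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing K6 items: {missing}")
    for item in K6_ITEMS:
        value = responses[item]
        if not MIN_RESPONSE <= value <= MAX_RESPONSE:
            raise ValueError(f"{item} must be between 0 and 4, got {value}")
    return sum(responses[item] for item in K6_ITEMS)


def screens_positive(responses: dict) -> bool:
    """True when the total meets the cutoff, so further evaluation is indicated."""
    return k6_total(responses) >= SMI_CUTOFF


# Hypothetical respondent (illustrative values only).
example = {"nervous": 3, "hopeless": 2, "restless_or_fidgety": 3,
           "so_sad_nothing_could_cheer_up": 2, "everything_an_effort": 2,
           "worthless": 1}
print(k6_total(example), screens_positive(example))
```

Because the scoring involves no skip patterns or probes, the whole procedure reduces to a range check, a sum, and one comparison, which is consistent with the argument that such tools can be administered by lay interviewers.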

Despite its widespread application in gen- 
eral population studies of psychiatric disor- 
ders, we are the only researchers to validate 
the K6 for use with a criminally involved 
sample of persons. We demonstrated the 
use of the K6 scale with a sample of adults 
who reported an arrest in the past year, and 
compared the classification results of the K6 
screen with those obtained using a common, 
but unvalidated, set of screening questions 
(e.g., receipt of past psychiatric treatment 
services, use of prescribed psychiatric medi- 
cations). Specifically, we compared the diag- 
nostic accuracy of the unvalidated set of 
questions with the K6 classification results, 
and examined the characteristics of participants
who were misclassified in order to
understand why they were incorrectly iden- 
tified by standard criteria. We also examined 
the characteristics of offenders who screened 
positive on the K6 scale for SMI, compared
with those who screened negative (Swartz & 
Lurigio, 2005; Swartz & Lurigio, in press). 

We found that nearly 20 percent of the 
1,700 participants with a past-year arrest in 
the 2002 NSDUH sample had a K6 score of 
13 or higher, indicating that in the past year 
they had experienced symptoms of severe 
psychological distress consistent with the 
presence of SMI. Our findings also indicated
that all the items in the K6 work equally 
well in detecting SMI among arrestees for 
both genders. In addition, we found the 
same pattern of item and scale consisten- 
cy among different racial/ethnic and age 
groups. Respondents with SMI were more 
likely than those with no SMI to report a 
past-year substance use disorder that met the 
DSM-IV criteria for dependence or abuse; 
they also were more likely to have received 
drug abuse and mental health treatment in 
the past year (Swartz & Lurigio, 2005; Swartz 
& Lurigio, in press).

Brief Jail Mental Health Screen. The Brief
Jail Mental Health Screen (BJMHS) was 
derived from the RDS (Teplin & Swartz, 
1989) and consists of eight dichotomous 
questions. The first six ask respondents 
whether they currently believe someone is 
putting thoughts into, or taking them out of, 
their heads; whether they currently believe 
that other people read their minds; whether 
family or friends have noticed that they 
are more active than usual; whether they 
have gained or lost weight for several weeks 
without trying to do so; whether they currently
feel that they have to talk or move more slowly
than usual; and whether they
have currently felt useless or sinful. The last 
two questions ask whether they are cur- 
rently taking any medication for emotional 
or mental health problems and whether they 
have ever been in a hospital for emotional or 
mental health problems. 

To test the concurrent validity of the 
BJMHS, Steadman et al. (2005) studied sam- 
ples of detainees who were and were not 
referred for mental health services based 
on the scale scores. Each of the detainees 
in those groups also was evaluated with the 
Structured Clinical Interview for DSM-IV 
(SCID), which provides a diagnosis-driven 
assessment of mental health problems. In 
this study, the SCID served as the gold 
standard, or the accepted reference or diag- 
nostic test, for psychiatric illness. Results 
showed that the BJMHS correctly classi- 
fied as having a diagnosable mental illness 
nearly 75 percent of the male detainees but 
only 62 percent of the female detainees. The 
researchers concluded that the BJMHS is, 
overall, a practical, simple, and efficient tool 
for psychiatric screening in jails, but that it 
has an unacceptably high false-negative rate 
for women. 


Summary and Conclusions 


Individuals with SMI are over-represent- 
ed in the criminal justice system. Since 
Abramson’s (1972) seminal work on the 
criminalization of the mentally ill, such per- 
sons have become more abundant in terms 
of their absolute numbers and proportion- 
ate representation in correctional popula- 
tions. Their prevalence in jails, prisons, and 
probation caseloads is likely to grow even 
further unless fundamental changes take 
place in the circumstances that contribute 
to their involvement in the criminal justice 
system. However, the historical factors that 
encouraged the criminalization of the men- 
tally ill are still common today: the lack of 
community-based mental health services for
persons with SMI, the splintering of the drug
abuse and mental health treatment systems, 
and the legal restrictions on involuntary 
commitment (Lurigio, 2005). 

Persons with SMI often reside in envi- 
ronments where crime and opportunities to 
commit crime are rampant, drug and crime 
enforcement strategies are aggressive, and 
mental health and drug abuse treatment 
programs are limited or inaccessible, all of
which increase the likelihood that the men- 
tally ill will be caught in the criminal-justice 
web (Draine, 1993; Fisher, Silver, & Wolff, in 
press). In addition, the co-occurrence of psy- 
chiatric and substance use disorders, which 
is more common in the criminal justice 
population than in the general population, 
elevates the risk of arrest, violent behavior, 
hospitalization, incarceration, recidivism, 
and a host of other adverse life events. The 
failure to treat SMI, substance use disorders, 
and their comorbidity jeopardizes public 
safety, promotes recidivism, and can result 
in legal liability for criminal justice depart- 
ments that are unresponsive to clients’ needs 
for mental health services (Lurigio, 2003). 

Effective screening is the first step in 
properly addressing the behavioral health- 
care problems of the mentally ill in the 
criminal justice system. Mental health 
resources within agencies and communities 
are scanty; therefore, accurate identification 
for direct service provision or referral is 
paramount. As we have noted in this article, 
corrections professionals typically have no 
background, training, or experience in the 
assessment or treatment of mental illness. 
Moreover, information on treatment history 
and current use of psychiatric medication is 
not always an accurate indicator of a current 
mental health problem. Similarly, subjective 
judgments regarding mental health treat- 
ment need are fraught with errors and lead 
to inconsistent and biased decisions among 
probation officers and other professionals, 
including those with expertise in the mental 
health field (Lurigio & Swartz, 2006). 

We recommend the use of the K6 as a
promising screening instrument for crimi- 
nal justice populations. The BJMHS is also 
a viable option for such purposes but should 
be further refined to improve its sensitivity
with women offenders. For both tools, more 
validation studies should be conducted to test 
their accuracy with different correctional
populations. In addition, the construction 
of mental health screening tools for correctional
clients should focus on identifying
persons with comorbid psychiatric and substance
use disorders. Such a screening tool
is presently being developed as part of the 
large-scale project known
as the Criminal Justice Drug Abuse Treatment
Studies (CJ-DATS), which is funded by the
National Institute on Drug Abuse (Sacks, 
Melnick, & Coen, 2005). Finally, feasibility, 
time, and resource studies must be under- 
taken to examine the effects of widespread 
mental health screening in criminal justice
settings. The bottom line is whether screen- 
ing actually results in better outcomes for 
individuals with SMI who are involved in 
the criminal justice system.


IN BOTH RESEARCH and practice, the
past two decades have produced consider- 
able developments in the field of specialized 
risk assessment for sexual offenders. Dozens 
of studies have contributed to a growing 
evidence base regarding recidivism risk 
factors and the potential efficacy of treat- 
ment interventions (Hanson & Bussiere, 
1998; Hanson & Morton-Bourgon, 2004). 
On the basis of these findings, a broad array 
of specialized actuarial and guided clinical 
assessment instruments have been intro- 
duced and continue to be tested and refined 
(Doren, 2004a). 

These developments have been fueled in 
part by resurgent policy attention to the issue 
of sex offender management during the past 
15 years. Prompted in part by federal legisla- 
tion, registration and community notification 
laws have been adopted nationwide, calling 
for effective systems of classifying levels of 
risk (Adams, 2002). Since 1990, seventeen 
states have adopted civil commitment laws 
for sexual predators—policies predicated on 
predictions of future violence and increas- 
ingly requiring states to address the complex 
issues associated with an individual’s suit- 
ability for community release (Harris, 2005). 
The emergence of specialized models for 
community supervision of sexual offenders 
has demanded effective means of applying 
risk assessment in a multi-disciplinary con- 
text (English, Pullen, & Jones, 1997). The 
introduction of risk-based sentencing systems 
has produced unique demands for evidence-based
decision tools that can inform the
sentencing process while ensuring due pro- 
cess (Kern & Farrar-Owens, 2004). As of this 
writing, legislative activity shows few signs
of slowing, with the issue of sex offenders
remaining at the top of state legislative crime 
control agendas (National Conference of State 
Legislatures, 2006). 

While increased policy focus on these 
issues has produced both a significant 
expansion of the evidence base and the 
increased attention of researchers, it has 
also presented the burgeoning field of sex 
offender risk assessment with considerable 
challenges. The often overlooked hetero- 
geneity of the population to be managed, 
coupled with the diverse range of organiza- 
tional and programmatic contexts in which 
risk assessment is required, produces con- 
siderable potential for over-generalization of 
particular findings or the misapplication of 
particular tools or approaches. 

As our policies have evolved, sex offender 
risk assessment has been called upon to 
respond to the needs of multiple stakehold- 
ers and to meet a wide range of legal, forensic, 
and clinical purposes. Treatment profession- 
als use it to develop treatment plans or evalu- 
ate progress. Probation and parole personnel 
use it to establish suitability for community 
supervision, case management, and inter- 
vention. The courts apply it for purposes of 
civil commitment or criminal sentencing. 
Law enforcement has adopted it for purposes 
of profiling, investigation, or designation of 
sex offender risk levels for purposes of regis- 
tration and community notification. 

Considering this diversity of contexts, 
it would appear that one-dimensional 
“debates” over the relative merits of par- 
ticular approaches to risk assessment (e.g. 
clinical vs. actuarial approaches) may miss 
an essential part of the picture. Indeed, the
key to finding “middle ground” between
competing viewpoints may rest primarily 
in achieving greater clarity of our goals and 
objectives, and in adapting our methods 
and practice to meet those goals (Dvoskin & 
Heilbrun, 2001). 

Consistent with this view, this article 
aims to present the discussion regarding 
sexual offender risk assessment in a circum- 
scribed programmatic context, with specific 
focus on the practice of community-based 
supervision of sexual offenders. 

This article consists of two parts. The first 
reviews the current state of sex offender risk 
assessment, considering the factors known 
to be associated with sexual recidivism and 
the methods currently utilized to trans- 
late those factors into risk assessment prac- 
tice. The article’s second section applies this 
understanding to the specific programmatic 
context of community-based supervision of 
sexual offenders, and sets forth a frame- 
work for integrating current risk assessment 
knowledge into systems of community-based 
supervision of sexual offenders. 


Sexual Offender Risk 
Assessment—The State of 
the Field 


As noted in the introduction, advances in the 
field of specialized sex offender risk assess- 
ment accelerated greatly beginning in the 
1990s. While this development was spurred 
in part by advances in general violence risk 
assessment research, it was greatly facilitated 
by a range of critical developments in the 
realm of public policy surrounding sexual 
offenders (Hanson, 2005). On the legisla- 
tive front, these developments included the 
spread of sex offender registration and com-
munity notification laws across the nation 
and the passage and implementation of civil 
commitment laws for sexually violent preda- 
tors. Concurrently, with growing national 
emphasis on issues surrounding prisoner 
re-entry and community corrections, juris- 
dictions across the country were expanding 
and refining a range of specialized commu- 
nity supervision models for sexual offenders 
(English et al., 1997). 

The common thread running through each 
of these diverse policy strategies is the shared 
goal of reducing recidivism among individu- 
als previously convicted of sexual offenses. 
Consistent with this goal, policymakers, 
researchers, and practitioners have focused 
increasing attention on three main areas: 


• The identification of risk factors associated
with sexual recidivism;

• The integration of those risk factors into
structured assessment instruments; and

• The refinement of interventions aimed at
reducing re-offense rates.


This section addresses the first two areas, 
setting the stage for a later discussion of the 
implications of risk assessment for guiding 
community supervision interventions. 


Risk Factors for Sexual Re-Offense 


In 1998, Hanson and Bussiere published a 
meta-analysis of 61 studies providing infor- 
mation on 28,972 sexual offenders and inves- 
tigating risk factors associated with sexual 
recidivism (Hanson & Bussiere, 1998). This 
was followed in 2004 by an updated meta- 
analysis encompassing the initial studies 
plus additional research conducted between 
1998 and 2003, accounting for a total of 95 
studies and over 31,000 sexual offenders 
(Hanson & Morton-Bourgon, 2004). 

The studies found aggregate sexual re- 
offense rates (based on average follow-up 
periods of 5-6 years) of 13.4 percent and 
13.7 percent, respectively. The earlier study 
further differentiated sexual re-offense 
rates between child molesters (12.7 percent 
re-offending) and rapists (18.9 percent re- 
offending). The studies also reported on 
rates of re-offense related to non-sexual violent
crimes, finding overall re-offense rates
of 12.2 percent and 14 percent, respectively (in the
earlier study, 9.9 percent of child molesters and
22.1 percent of rapists).


Identified Risk Factors 


While the later study included some notable 
additional findings, the two analyses were
fairly consistent in their overall conclusions 
regarding the major predictors of long-term 
recidivism. 

Topping the list in both studies was the 
presence of certain types of sexual deviancy, 
as measured by both phallometric assessment 
and deviant sexual preferences as measured 
by standardized tools or clinical records. 
The Hanson and Bussiere study found sexu- 
al interest in children (i.e., pedophilia) to be 
a strong predictive factor in child molesters, 
although it did not find sexual interest in rape 
to be a significant predictor among rapists. 
The later study confirmed this finding, add- 
ing the existence of other paraphilias (such 
as exhibitionism and voyeurism) as hav- 
ing additional predictive value. The study 
further cautioned that the lack of findings 
regarding an association between sexual re- 
offense and paraphilic interest may be due 
to a limited number of studies that investi- 
gated this association, and suggested further 
research in this area. 

The second most dominant factor iden- 
tified in both studies involved the pres- 
ence of antisocial lifestyle and orientation, 
as characterized by “rule violations, poor
employment history, and reckless, impulsive
behavior” (Hanson & Morton-Bourgon,
2004). Notably, in contrast with the sexual
deviance variables, this factor has been consistently
found to serve as a strong predictor
of general recidivism in non-sexual criminals
(Bonta et al., 1998). Considering
this, some have questioned whether similar 
mechanisms are at work in sexual and non- 
sexual offenders, or if antisocial orientation 
interacts with other predictors to create a 
unique dynamic among sexual offenders. 

Beyond these first two major factors—sex- 
ual deviance and antisocial orientation—the 
meta-analyses identified a range of addi- 
tional factors established as having moderate 
predictive value. These factors included: 


• Age (younger offenders presenting
higher risk);

• Number of prior offenses;

• Single marital status;

• Treatment failure;

• Sexual preoccupations; and

• Intimacy deficits.


Methodological Considerations 


Despite the growing base of knowledge 
related to risk factors for sexual recidivism, 
research in this area has been constrained 
by a range of methodological issues. Most 
of these issues relate, in one way or another,
to the base rate of sexual offending—i.e. the 
proportion of individuals within the popula- 
tion who eventually re-offend. 

As noted above, the aggregate base rate
for sexual re-offense, as established by the studies
included in the meta-analyses, is somewhere
between 13 percent and 14 percent. For
a variety of reasons, however, this figure 
most likely underestimates the “true” rate of 
sexual offending, and additionally does not 
effectively capture the range of variation in 
this rate across subsets of the sex offender 
population. Issues commonly associated 
with the base rate include: 

Under-reporting—The vast majority of 
studies addressing the issue of sexual recidi- 
vism operationalize re-offense as incidents 
that are detected and lead to arrest and con- 
viction. It is fairly well-established that only 
a limited proportion of sexual crimes—per- 
haps fewer than one in three—are report- 
ed to the police (Hart & Rennison, 2003). 
Accordingly, it is likely that actual re-offense 
rates may be substantially higher than those 
captured by recidivism researchers. A related
complication is
that the extent of this under-estimation may
not be uniform across groups of offenders, 
considering that offenders with certain char- 
acteristics (such as higher intelligence) may 
simply be more adept at avoiding detection. 

Population heterogeneity—Sexual offenders
are an extremely diverse group. Beyond
the fundamental distinction between rapists 
and child molesters, each of these groups 
includes a wide range of subtypes linked 
to victim choices, underlying motivations, 
behavioral patterns, and other factors (Knight 
& Prentky, 1990; Lanning, 1986). This het- 
erogeneity has a range of implications for 
both research and practice. From a method- 
ological vantage, failure to effectively distin- 
guish between these subgroups in research 
designs complicates the capacity to conduct 
within-group analysis, especially with those 
groups that are under-represented in samples 
or those with low overall base rates. In terms 
of application, this diversity of the offender 
population is generally not acknowledged in 
commonly used actuarial tools, leading some 
to question the validity of these instruments 
as means of predicting violence in individual 
cases (Hart, Webster, & Menzies, 1993). 

Timeframes—For reasons of resources and 
practicality, studies employ a wide range of 
follow-up periods in their assessment of recidi- 
vism. Although the studies included in Han- 
son and Bussiere’s meta-analysis involved an 
average follow-up period of five years, evidence
suggests that the risk of re-offense may extend
far beyond this threshold (Hanson, Steffy, &
Gauthier, 1993). Hence, the research time hori- 
zon must be viewed as a source of potential 
bias in the derivation of the base rate. 

Statistical Significance—From a research 
standpoint, the most immediate implica- 
tion of a relatively low observable base rate 
involves researchers’ reduced capacity to 
draw statistically significant conclusions 
from available data. While this may be miti- 
gated in part by increasing the sample size, 
many studies are limited in their capacity 
to expand their samples due to resource or 
logistical constraints. 
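
The effect of a low base rate on statistical precision can be illustrated with a brief sketch. The Python below is an illustration only, not a procedure drawn from the studies cited: it assumes a simple random sample, uses the normal approximation to a binomial proportion, and takes a 13.5 percent base rate as a stand-in for the 13 to 14 percent range reported above. The function names are hypothetical.

```python
import math

def expected_recidivists(base_rate, sample_size):
    """Expected number of observed re-offenders in a sample."""
    return base_rate * sample_size

def ci_half_width(base_rate, sample_size, z=1.96):
    """Approximate 95% confidence half-width for an observed proportion
    (normal approximation; an assumption made for illustration)."""
    return z * math.sqrt(base_rate * (1 - base_rate) / sample_size)

# Compare a small and a large hypothetical study at the same base rate.
for n in (200, 2000):
    events = expected_recidivists(0.135, n)
    margin = ci_half_width(0.135, n)
    print(f"n={n}: about {events:.0f} re-offenders, margin +/- {margin:.3f}")
```

Under these assumptions, a sample of 200 yields only about 27 observed re-offenders, which leaves little room for subgroup comparisons; a tenfold larger sample narrows the margin of error by a factor of roughly three.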

Diminished Predictive Value—The pre- 
dictive capacity of actuarial risk assessment 
instruments is directly influenced by the 
base rate upon which that instrument has 
been based and validated—the lower the 
base rate, the higher the probability of error. 
In general, low base rates are most likely to 
increase the probability of “false alarms.” 
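
The link between base rates and false alarms follows from Bayes' rule, and a short worked example may make it concrete. The sketch below is illustrative only: the sensitivity and specificity values of 0.70 are hypothetical and do not describe any instrument discussed in this article, and the function name is an assumption.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Share of individuals flagged as high risk who actually re-offend."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical instrument accuracy, applied at two different base rates.
for base_rate in (0.135, 0.50):
    ppv = positive_predictive_value(base_rate, sensitivity=0.70, specificity=0.70)
    print(f"base rate {base_rate:.3f}: PPV {ppv:.2f}, false alarms {1 - ppv:.2f}")
```

With these hypothetical values, roughly 73 percent of those flagged at a 13.5 percent base rate would be false alarms, compared with 30 percent at a 50 percent base rate.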


Static and Dynamic Factors 


Reviewing the major variables known to be 
most closely associated with long-range sexual 
recidivism, many have noted that the majority 
of these factors are either static or highly stable 
in nature. While we have developed a fairly 
good sense of these immutable case character- 
istics that might place certain individuals at 
higher risk of re-offense than others, we have 
a much more limited understanding of the 
influence of dynamic characteristics associ- 
ated with sexual recidivism risk (Craissati & 
Beech, 2003; Hanson & Harris, 2000b). 

Certainly, the use of static variables in an 
applied context carries some clear advan- 
tages. Beyond their long-range predictive 
value, they are comparatively easily acces- 
sible through official records, and generally 
involve little or no subjective judgment. Yet 
over time, these static predictors gradually 
lose their utility for the majority of offenders 
under community supervision. Hence, effec- 
tive systems of community supervision may 
begin with an understanding of an individ- 
ual’s general risk as predicted by static vari- 
ables, but ultimately depend on our capacity 
to identify and respond to changes in risk 
levels over time. 

Dynamic variables may be viewed in terms 
of stable and acute factors (Gendreau, Little, 
& Goggin, 1996; Hanson & Harris, 2000b). 
Stable factors are those mutable characteris- 
tics of the individual that may change over
time, but are not generally subject to short- 
term fluctuations. Key stable dynamic factors 
include variables such as cognitions, insight, 
treatment compliance, and attitudes related 
to offending behaviors. Acute factors reflect 
case characteristics that may change over 
more limited periods of time, in some cases
weeks, days, or even hours. These factors 
might include both short-term life changes 
in domains such as employment, residence, 
or relationships, and immediate conditions 
such as intoxication or circumstances that 
may provide access to potential victims. 

Research regarding the effects of dynamic 
variables on sex offense recidivism remains in 
a developmental state. While methodological 
limitations have constrained much research 
in this area, key dynamic factors that appear 
to be related to recidivism include social 
adjustment, attitudes towards victims, self- 
awareness regarding risk to recidivate, victim 
access, and cooperation with supervision and 
treatment (Hanson & Harris, 2000b). The 
results of the Dynamic Supervision Project, 
a five-year longitudinal study currently fol- 
lowing 1,000 offenders under community 
supervision in Canada, Alaska, and Iowa, 
may eventually provide further perspective 
on these factors (Harris & Hanson, 2003). 


Approaches to Sex Offender Risk 
Assessment 


Hanson (2002) cites three potential approach- 
es to sexual offender risk assessment—pure 
actuarial approaches, which make predic- 
tions based on survey instruments that 
leave no room for subjective interpretation; 
guided clinical approaches, which rely on 
the systematic professional judgment of 
qualified professionals based on empirically- 
derived instruments; and adjusted actuarial 
approaches in which professional judgment 
is superimposed on actuarial scores. To these 
options, we might add a fourth plausible 
approach—the use of unstructured clinical 
judgment to determine risk. 

Doren (2004a) identifies over 20 instru- 
ments that have been applied in the assess- 
ment of sex offender risk. These instruments 
are varied—some have been developed as a 
means of evaluating the potential for gen- 
eral violence risk (Quinsey, Harris, Rice, & 
Cormier, 1998; Webster, Douglas, Eaves, & 
Hart, 1997), while others have been geared 
towards identifying the risk specifically for 
sexual offenders (Epperson et al., 1999; Han- 
son, 1997; Hanson & Harris, 2000a; Hanson 
& Thornton, 1999; Hart, Kropp, & Laws,
2004). Some are pure actuarial tools that
present a fairly one-dimensional perspective
on an individual’s relative risk level (Han- 
son, 1997) while others are designed to be 
utilized as support systems for more com- 
prehensive clinical determinations (Hart et 
al., 2004). Some rely solely on static variables 
(Hanson & Thornton, 1999), while others 
integrate dynamic predictors on a limited 
(Epperson et al., 1999) or exclusive (Hanson 
& Harris, 2000a) basis. 

Following a brief review of some of these 
instruments, we will consider the relative 
utility of the various approaches. 


Actuarial Assessment 


Several specialized actuarial instruments 
for the prediction of sexual re-offense have 
emerged during the past decade. The actu- 
arial approach, in a nutshell, gathers a series 
of variables believed to have predictive valid- 
ity, applies relative weights to each variable, 
and combines these data into an aggregated 
risk score and classification. 
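
The mechanics of this approach can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions: the item names, weights, and cut-offs are invented for this sketch and do not reproduce the scoring rules of any validated instrument.

```python
# Hypothetical item weights, invented for illustration only.
WEIGHTS = {
    "prior_sexual_offense": 2,
    "extrafamilial_victim": 1,
    "age_under_25": 1,
    "male_child_victim": 1,
}

# Hypothetical cut-offs (minimum score, label), in ascending order.
CUTOFFS = [(0, "low"), (2, "moderate"), (4, "high")]

def risk_score(case):
    """Sum the weights of every item recorded as present for the case."""
    return sum(weight for item, weight in WEIGHTS.items() if case.get(item, False))

def risk_category(score):
    """Return the label of the highest cut-off that the score reaches."""
    label = CUTOFFS[0][1]
    for threshold, name in CUTOFFS:
        if score >= threshold:
            label = name
    return label

# Two hypothetical cases with different items present.
case_a = {"prior_sexual_offense": True}
case_b = {"extrafamilial_victim": True, "age_under_25": True}
for case in (case_a, case_b):
    total = risk_score(case)
    print(total, risk_category(total))
```

In this sketch the two hypothetical cases reach the same total and classification from different items, since an aggregated score does not record which factors produced it.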

One widely used instrument is the 
Rapid Risk Assessment for Sexual Offense 
Recidivism, known as the RRASOR (Han- 
son, 1997). The RRASOR is notable for its 
brevity and ease of use—it consists of only 
four variables, all of which can be easily 
pulled from official records. These four fac- 
tors—prior sexual offenses, extra-familial 
victims, offender age under 25, and male 
child victims—were identified for use in the 
scale based on research indicating a strong 
correlation between these factors and risk 
of re-offense. While demonstrating moder- 
ate predictive accuracy, the RRASOR omits 
several variables shown to have particu- 
larly high correlations with re-offense risk, 
including deviant sexual preferences, antiso- 
cial orientation, and treatment compliance. 

A second commonly used tool, the Static- 
99, addresses some of these shortcomings by 
combining the RRASOR with a second scale, 
the Structured Anchored Clinical Judgement- 
Minimum. Beyond the variables contained 
in the RRASOR, the Static-99 considers a 
range of additional factors including sexual 
deviance, range of available victims, per- 
sistence, and a pattern of antisocial behav- 
iors (Hanson & Thornton, 1999; Hanson & 
Thornton, 2000). In a comparative review, 
the Static-99 has been demonstrated to add 
to the predictive accuracy of the RRASOR in 
the measurement of long-term risk potential 
(Hanson & Thornton, 2000). 




September 2006 


RISK ASSESSMENT AND SEX OFFENDER SUPERVISION 39 


A third instrument—the Sex Offender 
Risk Appraisal Guide (SORAG) (Quinsey et 
al., 1998)—measures a different, although 
likely closely related, group of factors compared
to the Static-99. This scale, adapted
from a general violence prediction tool
known as the VRAG, is notable for its inte- 
gration of psychiatric and psychological 
variables, including psychopathy and mental 
illness diagnoses. Its relative predictive value 
appears comparable to the Static-99 in the 
prediction of sexual recidivism, and appears 
to more effectively predict non-sexual vio- 
lent recidivism (Hanson & Thornton, 2000). 

Beyond these instruments designed for 
general use, some states have developed 
customized instruments, generally under 
the auspices of a state agency, designed for 
specific uses. The Minnesota Sex Offender 
Screening Tool (MnSOST) was originally 
developed in the early 1990s by the Minne- 
sota Department of Corrections as a means 
of codifying factors viewed to place an indi- 
vidual at high risk for re-offense (Huot, 
1999). Revised to the MnSOST-R in 1996 
(Epperson et al., 1999), the tool was explicitly 
designed to be used by non-clinical staff. 
Research on the MnSOST has demonstrated 
moderate predictive capacity, comparable to 
other commonly used actuarial instruments 
(Barbaree, Seto, Langton, & Peacock, 2001; 
Hanson & Morton-Bourgon, 2004). 

Finally, the SONAR (Sex Offender Needs 
Assessment Rating) was designed in 2001 as 
an actuarial tool based on dynamic variables 
(Hanson & Harris, 2000a). Viewed as an 
adjunct to actuarial instruments based on 
static factors, the SONAR captures informa- 
tion across both stable and acute dimensions. 
Stable factors include intimacy deficits, neg- 
ative social influences, attitudes toward sex 
offending, and self-regulation. Acute factors 
include substance abuse, negative moods, 
anger, and victim access. The SONAR was 
subsequently adapted into two scales—the 
STABLE 2000 and the ACUTE 2000. These 
scales, combined with the Static-99, form the 
basis for a blended approach toward commu- 
nity supervision designed to capture long- 
term, intermediate, and short-term factors 
associated with sexual recidivism (Harris & 
Hanson, 2003). 


Structured Clinical Decision Tools 


In contrast with actuarial instruments, which 
contain explicit rules for weighting each vari- 
able, structured clinical assessment guides 
the evaluator to consider a range of empirically
validated risk factors, which the evaluator
then weighs to reach a general estimate of risk.

One example of a structured clinical
decision tool is the Sexual Violence Risk-20
(SVR-20) (Boer, Hart, Kropp, & Webster,
1997). Applying a similar approach to
the HCR-20—a tool used to structure clini- 
cal decisions regarding the risk of general 
violence (Webster et al., 1997)—the SVR-20 
encompasses twenty variables that are dis- 
tributed into three broad domains. These 
domains include psychosocial adjustment 
(encompassing factors such as sexual devi- 
ance, history of childhood sexual abuse, 
psychopathy, relationship problems, employ- 
ment instability, and offending history); the 
nature of sexual offending (such as levels 
of violence employed, escalation in offense 
severity, and attitudes toward offending 
behaviors); and future plans (i.e., responses 
to interventions). One notable character- 
istic of tools such as the SVR-20 lies in 
their potential to capture and integrate the 
individual’s responses and reactions to treat- 
ments and interventions. The SVR-20 has 
recently been modified into a new instru- 
ment known as the Risk for Sexual Violence 
Protocol (RSVP) (Hart et al., 2004). 


Comparing the Approaches 


The research on sexual offense recidivism 
has focused primarily on issues of long-term 
risk. Bounded by certain methodological 
limitations, this research has been highly 
focused on static factors that have been dem- 
onstrated to be associated with the probabil- 
ity of future sexual offenses. 

In the context of these circumstances, 
it is not terribly surprising that actuarial 
assessment has carried the day. Research on 
the comparative ability of these approaches 
to predict general recidivism in a popula- 
tion of sexual offenders has found actuarial 
assessments to be most accurate, followed by 
guided clinical approaches, then by unstruc- 
tured clinical judgment (Hanson & Bussiere, 
1998; Hanson & Morton-Bourgon, 2004). 

Regarding adjusted actuarial methods, 
actuarial “purists” argue that the track 
record of clinical judgment is so poor that 
scores on validated instruments should not 
be tainted with any subjective interpretation 
(Quinsey et al., 1998). However, some have
argued that the proven superiority of such
adjusted approaches in other domains (notably
weather forecasting) indicates that the
adjusted actuarial approach may hold
promise (Monahan & Steadman,
1996; Swets, Dawes, & Monahan, 2000). To
date, however, little or no empirical evidence 
has emerged testing this premise (Hanson & 
Morton-Bourgon, 2004). 

Yet before concluding that actuarial meth- 
ods represent a uniformly superior means of 
risk assessment, one must also recognize 
certain limitations to the actuarial approach. 

First, on a purely conceptual level, many 
have questioned the validity of making case- 
level determinations on the basis of tools 
that have been derived solely from popu- 
lation-based probabilities. This issue has 
arisen particularly in the context of sexually 
violent predator civil commitment decisions. 
Accordingly, some have argued that sole reli- 
ance on actuarial instruments risks missing 
important clinical information that can aid 
significantly in prediction (Hart, 1998). 

Second, the most widely employed actu- 
arial scales are implicitly one-dimensional in 
nature. Conceivably, individuals with entire- 
ly different constellations of risk factors may 
be classified at similar levels of risk—a fact 
that may obscure important distinctions for 
purposes of service planning and supervision.
In this sense, actuarial scales’ reliance
on measures of cumulative risk obscures
important case characteristics that may indi- 
cate an elevated risk. While some have called 
for multi-dimensional models drawing from 
multiple actuarial instruments (Doren, 
2004b), evidence to date has not lent support 
to such an approach (Seto, 2005). 

Ultimately, the relative superiority of one 
method or another is highly dependent on 
the questions that we are asking. If our 
primary concern deals with the aggregated 
long-term risk posed by a group of individu- 
als, actuarial instruments almost certainly 
provide the most valid means of assessing 
such risk. If we are concerned with setting 
forth the relative probability that a par- 
ticular individual will re-offend at some 
undetermined point in the future, actuarial 
instruments provide a moderate degree of 
accuracy, albeit one prone to errors. 

Yet as soon as we turn to different types 
of questions, the relative utility of currently 
available actuarial instruments dissipates 
considerably. Under what circumstances 
would this person be most likely to re- 
offend? What is the probable timeframe of 
re-offense? How has this person’s re-offense 
risk been mitigated by our interventions? 
What is the probable impact of treatment 
and supervision? While work continues on 
actuarial approaches that might eventually 
answer some of these questions, these issues
simply cannot be adequately addressed by 
current actuarial methods. 

Considering these factors, the remainder 
of this article is grounded on three funda- 
mental premises regarding the clinical-actu- 
arial distinction—first, that any discussion 
regarding the relative merits of clinical vs. 
actuarial approaches cannot occur in a vac- 
uum, and must be placed in its appropri- 
ate programmatic and operational context; 
second, that, while the clinical-actuarial 
distinction is important from a theoretical 
perspective, and while some circumstances 
call for orthodox adherence to one of the 
two methods, the majority of sex offender 
management practice calls for operating on 
a “middle ground” that draws from both 
approaches; and third, that the clinical- 
actuarial continuum is only one dimension 
within a broader practical framework that 
integrates a range of related constructs. 


Risk Assessment and 
Community Supervision 
Practice: A Framework 


Having reviewed the existing state of sexu- 
al offender risk assessment knowledge and 
practice, we now turn to the fundamen- 
tal question presented at the outset of this 
article—how can risk assessment systems 
and methods be effectively aligned with the 
specific goals and challenges of community 
supervision practice? 

As noted earlier, the risk assessment meth- 
ods to be applied in a given situation are high- 
ly dependent on the specific questions that 
need to be answered. The variability of these 
questions may be viewed in terms of orga- 
nizational demands, in accordance with the 
distinct information needs of central man- 
agement, unit supervision, and line staff; in 
terms of case level demands, recognizing the 
significant heterogeneity of the sex offender 
population in terms of offense type and sever- 
ity, motivations, and associated levels of risk; 
and in terms of temporal demands, noting 
that the challenges associated with managing 
a particular case change over time. 

Figure 1 presents a multi-dimensional 
framework that aims to address some of these 
sources of variability, integrating the risk 
assessment concepts described earlier with 
the range of challenges associated with the 
community supervision of sexual offenders. 


FIGURE 1: 
The framework consists of two sets of elements: a series of key dimensions, denoted by the five
black bands, and a series of domains, contained within the grey shaded box. 


Key Dimensions 


The five noted dimensions, described below, 
are not intended to be categorical in nature— 
rather, each should be viewed as a part of a 
continuous spectrum of choices that must 
be made in conjunction with the process of 
community supervision. 


Dimension #1: Primary Orientation 


The model begins by framing the distinction 
between a nomothetic (i.e. population-based) 
approach and an idiographic (i.e. individual- 
based) approach as a framework with which 
to understand the appropriate (and in turn 
the inappropriate) application of risk assess- 
ment to sexual offenders. Under a nomothetic 
orientation, decisions are driven exclusively 
by population-based probabilities based on 
empirically validated systems of evaluating 
risk. Evidence-based policy and practice in 
this arena involves the assessment of the 
populations of concern using actuarial meth- 
ods that rely primarily on static or highly 
stable predictors of re-offense. Under an 
idiographic orientation, decisions are based 
on case-specific attributes based on circum- 
stances presented during a particular point in 
time. Evidence-based practice in this instance 
depends on the far less developed area of 
research into dynamic predictors of sexual 
recidivism, and generally employs informed 
practitioner judgment. 


Dimension #2: Risk Emphasis 


Heilbrun (1997) distinguishes between pre- 
diction-oriented styles and management- 
oriented styles of communicating risk. 
Prediction-oriented risk assessments are 
appropriately applied in contexts that explicitly
call for understanding the general likelihood
of an event occurring at some undefined 
point in the future. Conversely, manage- 
ment-oriented approaches are more suited 
to the ongoing task of understanding and 
managing risk at the case-level. Dvoskin 
& Heilbrun (2001) associate the predictive 
orientation with actuarial methods and the 
management orientation with clinical meth- 
ods, suggesting this distinction as a means 
of bridging the divide in the actuarial vs. 
clinical debate. 


Dimension #3: Risk Factors 


The framework presents risk factors as a 
spectrum covering three general domains— 
static factors, stable (dynamic) factors, and 
acute (dynamic) factors (Hanson & Harris, 
1998). As a matter of practice, there is a 
strong rationale for viewing these factors as 
a continuum rather than as discrete catego- 
ries. While some factors such as historical 
variables are by definition immutable, other 
“static” variables might straddle the domains. 
For example, whether psychopathy is
a static, immutable trait
or a highly stable (but ultimately changeable)
personality characteristic remains open to
debate. The precise boundaries between stable
and acute dynamic factors, often loosely
defined by issues of timing or magnitude, 
may be similarly unclear. 


Dimension #4: Primary Methods 


The actuarial-clinical dimension, like the other
dimensions contained within the framework,
is viewed as a spectrum of alternatives, rather
than an “either-or” proposition. At the far
ends of this spectrum, the specified method
is framed as the predominant (although not
necessarily exclusive) means of gathering
salient and valid information. In the middle
of this spectrum, the framework considers
blended approaches integrating both actuarial
methods and bounded practitioner judgment as
the most effective means of assessing risk.


Dimension #5: Frequency of Assessment 


The final dimension—the frequency of risk 
assessment processes—represents a critical 
operational issue related to planning and 
implementation of community supervision
systems. Baseline assessments based
exclusively on static variables, by definition,
tend not to require repeated administration, 
presenting minimal burden on operations 
and resources. Similarly, periodic structured 
assessments to gauge gradual change in rel- 
atively stable case characteristics can be 
integrated into regular work processes with 
predictable impact. The ongoing demands 
associated with identifying and respond- 
ing to imminent risk, however, present a 
wide range of operational challenges asso- 
ciated with issues such as communication, 
surveillance systems, and staff workloads. 
This factor may also be viewed as a signifi- 
cant potential operational impediment to 
the introduction of actuarial methods as a 
means of assessing acute risk. 


Policy and Practice Domains 


Having briefly considered the general dimen- 
sions and the relationships between them, 
our next step is to apply these dimensions to 
the specific challenges associated with the 
community supervision of sexual offenders. 
The figure’s primary columns, denoted by 
the shaded boxes, divide these challenges 
into four general domains—policy and man- 
agement, baseline planning, case manage- 
ment, and acute intervention. 


Policy and Management 


The policy and management domain encompasses
the actions and decisions of organizational
leadership within parole and
probation agencies. It may also, under cer- 
tain circumstances, encompass the actions 
of legislators charged with the crafting of 
public policies associated with community- 
based sex offender management. 

Although actors within this domain gen- 
erally operate independently of case-level 
decisions, reliable and valid data regard- 
ing the risk levels presented by individuals 
under agency supervision emerge as vital 
management indicators—indicators that 
affect such matters as the formulation of 
policies and procedures, the allocation of 
resources, organizational strategy, quality 
management, and program design. 

At the policy and management level, 
effective decision-making depends largely on 
the maintenance of a nomothetic perspective 
centered upon population-based indices and 
patterns. The information required to make 
key decisions in this domain emphasizes the 
prediction of general risk within the popu- 
lation, rather than the specific risk posed 
by individual cases. Consistent with these 
goals, actuarial assessments driven by static 
characteristics within the population gener- 
ally provide appropriate levels of informa- 
tion to inform decisions regarding resources 
and organizational strategy. 


Baseline Planning 


At the individual case level, one of the 
initial tasks faced by community supervi- 
sion agencies involves the establishment of 
baseline levels of risk. This assessment may 
occur as part of a Pre-Sentence Investigation 
(PSI) process, as part of a prisoner re-entry 
plan, or as part of the agency’s classifica- 
tion process, often in collaboration with law 
enforcement, correctional authorities, and 
treatment providers. 

Within this domain, line managers and 
staff are required to make a range of decisions 
associated both with the initial terms of pro- 
bation or parole and with the allocation of 
often limited resources. Who is appropriate 
for lifetime or intensive supervision? What 
special conditions and restrictions need to 
be placed on each individual? What are each 
individual’s treatment needs and potential 
responsiveness to treatment? Whose risk 
might be mitigated with access to ancillary 
services such as substance abuse treatment, 
employment programming, mental health 
services, or residential programming? 

Within this domain, prediction-oriented 
assessments provide case managers with a
baseline assessment of an individual’s general
risk. Such general predictions may inform 
such decisions as development of initial case 
plans and the resource-intensity of supervi- 
sion. Over time, however, prediction-oriented 
notions of risk gradually lose their relative 
utility to those charged with individual super- 
vision, giving way to a significant demand for 
management-oriented approaches. 


Case Management 


In contrast with the baseline planning stage, 
the case management domain shifts the 
emphasis from the realm of prediction into 
the realm of management. While baseline 
risk levels provide highly relevant context to 
ongoing service planning and risk manage- 
ment, the greater concern becomes the flow 
of information regarding changes in the 
offender’s psychological, social, or environ- 
mental status. Are insights and attitudinal 
adjustments being gained in treatment? Has 
the individual managed to maintain rela- 
tionships, employment, and housing? Does 
the process of community integration seem 
to be succeeding? 

The answers to these types of questions 
carry a range of implications for both the 
agency and for the individual case. At the 
agency level, they help to prioritize the 
assignment and allocation of resources, and 
provide potentially valuable information to 
managers regarding the efficacy of inter- 
ventions. At the case level, they provide 
supervision staff with vital data relevant 
to adjustments in service plans, expansion 
or contraction of terms and conditions, or 
identification of emergent needs. 


Acute Intervention 


In contrast with the case management 
domain, in which programmatic adjust- 
ments are made based on gradual evolution 
of circumstances, decisions in this domain 
are concerned with short-term changes in 
psychological, social, or environmental con- 
ditions that might presage offending behav- 
ior. This domain’s primary concern is based 
on one central question—namely, when is an 
individual at imminent risk of re-offending? 

By necessity, this domain focuses on the 
unique characteristics of the individual case, 
and accordingly falls at the idiographic end 
of the spectrum. While knowledge of the 
individual’s general risk level might provide 
useful context, general predictions of the 
person's probability of re-offense are far less 
salient than information that will identify 
factors associated with pending re-offense
and, in turn, inform appropriate interven- 
tion. Accordingly, the static risk factors that 
might have contributed to this individual’s 
baseline risk assessment carry relatively little 
practical utility when compared to time- 
specific situational factors such as access to 
potential victims (i.e., opportunity), relapse 
into drug or alcohol use, lapses in compliance 
with terms of supervision, and stressors such 
as the loss of a job, home, or relationship. 

In contrast with domains towards the 
other end of the spectrum, assessment 
methods within this domain remain highly 
dependent on bounded professional judg- 
ment. While actuarial systems for evaluating 
changes in acute dynamic factors among 
sexual offenders remain under development 
(Hanson & Harris, 2001), their efficacy and 
utility have not been fully explored. Accord- 
ingly, given the current state of knowledge, 
the assessment of acute dynamic factors 
remains largely dependent on practitioner 
judgment supported by effective training, 
protocols, and systems of communication. 


Addressing Population Variation 


The framework presented above suggests 
that risk assessment methods must adapt 
to variation across organizational processes 
and functions. Equally important, however, 
these methods must respond to another 
critical source of variation, specifically that 
related to the population under supervision. 

The heterogeneity of the sex offender 
population is well established, and has been 
delineated in a range of typologies developed 
in both clinical and law enforcement
contexts. These typologies have identified
significant areas of divergence within
populations of rapists (Knight, 1999), child 
molesters (Knight, Carter, & Prentky, 1989; 
Lanning, 1986) and even sexual murderers 
(Schlesinger, 2004). Key dimensions associ- 
ated with this variation include primary 
motivation, intelligence, underlying sexual 
deviance, anger, opportunity, victim rela- 
tionship, level of force, and a broad range of 
other factors. 

Within sex offender typologies, the vari- 
ous constellations of these factors (some 
of which may be closely related) produce a 
wide range of potential offender subtypes, 
each associated with distinctive levels of 
risk. Further, and perhaps more critically, 
the triggers for re-offense may be mark- 
edly different across these subtypes. This 
factor presents significant challenges to the
development of uniform methodologies for
assessing dynamic risk. 

Of particular importance to community 
supervision practice is the fact that, to date, 
the evidence base has been weighted towards 
higher-risk offenders, particularly those who 
have been released following a prison sen- 
tence. Accordingly, the dynamic factors that 
may trigger re-offense among probationers 
who fall into lower long-term risk catego- 
ries are far less understood, and represent a 
critical area for future research (Hepburn & 
Griffin, 2004; Meloy, 2005). 


Conclusion 


Approximately 60 percent of sex offenders 
under correctional supervision in the United 
States are sent to serve their sentences in 
the community (Greenfeld, 1997). Moreover, 
sexual offenders comprise approximately 5-6 
percent of individuals released to parole agen- 
cies, with an estimated 30,000 under parole 
supervision on a given day (Hughes, Wilson, 
& Beck, 2001). Considering these figures, the 
development of effective systems for com- 
munity-based supervision looms large in our 
overall approach to sex offender management. 

As more and more jurisdictions develop 
specialized capacity to manage sex offenders 
in the community (English et al., 1997), the 
demands for effective risk assessment have 
continued to expand. As such, it remains 
vital that the role of risk assessment, and 
consequently the methods that are employed, 
be bounded by the specific challenges faced 
by probation, parole, and community cor- 
rections agencies. This requires recognizing 
and adapting to the range of variation both 
in organizational-programmatic goals and 
within the population under supervision. 

On a final note, the role of risk assess- 
ment in community supervision practice 
cannot be divorced from the unique social 
and political context in which our soci- 
ety views sexual crime, its perpetrators, 
and its victims. With the issue of sexual 
offending remaining at the forefront of leg- 
islative agendas, and with persistent public 
misconceptions surrounding the nature of 
sexual crime, community supervision agen- 
cies operate in a political environment with 
a “zero tolerance” approach to errors, in 
which one tragic case can lead to widespread 
calls for system reform. In this environment, 
the imperative of targeted, adaptable, and 
responsive means of risk assessment should 
be evident. 


THE LAST TWO DECADES have borne
witness to a rise in the correctional popu- 
lation so colossal that it was previously 
inconceivable to practitioners and criminol- 
ogists alike. As a result, annual exercises in 
economic calisthenics have become com- 
mon practice for corrections administrators 
throughout the U.S. The complexity of such 
a population boom can only be fully appre- 
ciated by realizing that available resources 
have hardly kept pace. Expectations to do 
more with less, an emerging mantra among 
correctional administrators and practitioners,
drive corrections professionals in a
seemingly unending search for promising 
technological developments that might help 
to bridge the service gap created by surging 
offender populations and waning budgets. 
The deleterious effects of this current 
“crisis in corrections” are becoming so 
entrenched in local- and state-level prac- 
tices that some have remarked, “Corrections 
has become the Pac-Man of government 
budgets, gobbling up resources as legisla- 
tors seek to finance competing needs with 
shrinking tax revenues” (Pierce, 1991 as 
cited in Cullen, Wright, & Applegate, 1996, 
p. 70). A discussion of the burdens levied by 
this crisis would be remiss if it overlooked 
the presence of these issues at the federal 
level. In fact, according to a recent
year-in-review report, the federal probation
and pretrial services system has recently 
suffered severe budgetary and, consequently, 
operational constraints (United States Probation
and Pretrial Services, 2005). Specifically
identified were operational and service
restrictions resulting from unprecedented 
budget deficiencies. In an attempt to realize 
some relief through cost-saving practices, 
the internal administration of federal pro- 
bation and pretrial services recommended 
that: a) the agency move toward an evidence- 
based approach and implement a process to 
measure client outcome that will generate 
meaningful agency evaluations and subse- 
quent operation refinements; b) the agency 
strive to restructure staff workloads and 
prioritize resources, most notably by identi- 
fying offenders eligible for early release; and 
c) the agency remain resolute in its commit- 
ment to provide services of exceptional qual- 
ity in spite of fewer resources (United States 
Probation and Pretrial Services). 

Actuarial classification systems that yield 
valid measures of risk and criminogenic 
need hold considerable promise for cor- 
rectional agencies in these regards. First, 
beyond informing decisions about custody 
and service provision, initial and reassess- 
ment risk/need scores also provide outcome 
measures useful for evaluating both offender 
and agency success. Second, because clas- 
sification tools identify different levels of 
offender risk, the tools’ corresponding risk 
categories are inherently useful for restruc- 
turing staff workload, prioritizing agency 
resource expenditure, and identifying low- 
risk offenders for early release. Finally, actu- 
arial classification systems can do a great 
deal for agencies concerned about the quality
of service provision. Classification tools
can improve service quality by promoting 
resource, custody, supervision, and treat- 
ment decisions that are better informed, 
more accurate, and ultimately more useful 
(Latessa, 2003-2004). 


Classification 


Classification systems disaggregate het- 
erogeneous correctional populations into 
subgroups that maximize between-group
differences and minimize within-group differences.
Classification processes create these
subgroups on the basis of offender charac- 
teristics relevant to correctional outcome, 
which in turn facilitates and justifies dif- 
ferential service provision. The use of classi- 
fication devices allows correctional agencies 
to simultaneously address multiple objec- 
tives, including improved predictive accura- 
cy, better informed treatment assignments, 
more effective supervision, and meaningful 
outcome analysis (Clements, 1996). While 
most correctional agencies currently use 
some type of classification system to guide 
decision making (Jones, Johnson, Latessa, & 
Travis, 1999), it is important to note that not
all classification systems are created equal.
The approach taken to behavioral pre- 
diction can significantly affect the resul- 
tant validity. Decisions involving behavioral 
forecasts are grounded in either a clinical or 
actuarial approach (Meehl, 1954). Whereas 
clinical decision making is intuitively based
and justified by claims of training, experience,
and expertise, actuarial decision making
is based on scientific evidence generated
from observed behavioral outcomes for risk- 
similar offenders (Monahan, 1981). Research 
investigating the accuracy of each method 
consistently supports the superiority of actu- 
arial classification decisions over clinical 
practices (Gottfredson & Gottfredson, 1986; 
Grove, Zald, Lebow, Snitz, & Nelson, 2000). 

In addition to the importance of how 
predictions are reached, evidence also sug- 
gests that what is assessed can greatly affect 
the accuracy and utility of classification 
decisions. While the majority of actuarial 
classification devices rely exclusively on his- 
torically-based static risk factors (e.g., crimi- 
nal history) to predict reoffending, others 
have evolved in response to recent knowl- 
edge advancements in risk prediction. Heed- 
ing the findings evidenced in current risk 
factors research (for example, see Gendreau, 
Little, & Goggin, 1996), researchers have 
developed advanced classification tools to 
augment traditional static-driven prediction 
models with a battery of predictors said to 
be dynamic because of their present-day (as 
opposed to life-history) focus. The resultant 
combination of both static and dynamic risk 
factors yields greater predictive accuracy 
for classification systems. For example, in 
an examination of male parolee revocation, 
Brown (2002) found that combined static 
and dynamic prediction models were signifi- 
cantly better at forecasting revocation than 
either static or dynamic models alone. Addi- 
tionally, in a quantitative literature review 
comparing the averaged correlation between 
static risk factors and recidivism to the 
averaged correlation between dynamic risk 
factors and recidivism, findings supported 
the greater predictive accuracy of dynamic 
factors (Gendreau et al.). 

Beyond improving the accuracy of pre- 
dictions, devices that measure risk through 
a combination of static and dynamic predic- 
tors offer considerable utility to correction- 
al agencies. Because dynamic risk factors 
are characteristics currently present in an 
offender’s life, they are inherently sensitive 
to detecting change and measuring progress 
(Taylor, 2001). Measuring both static and 
dynamic risk factors thus becomes a way 
of not only improving risk management 
but moving toward risk reduction (Latessa, 
2003-2004). That being said, it is important 
to note that risk reduction depends upon 
correctional agencies matching their service
intensity levels to actuarially determined
offender risk levels (Lowenkamp &
Latessa, 2005). Empirical investigations have 
consistently established that to be effec- 
tive, correctional agencies must direct a 
majority of their resources toward high-risk 
offenders. Correctional practices informed 
by this approach are commonly referred 
to as adhering to the risk principle; they 
offer evidence-based services entrenched 
in research findings that have consistently 
demonstrated efficacy in reducing offend- 
ing while also avoiding the iatrogenic effects 
associated with servicing low-risk popula- 
tions (Andrews, Zinger et al., 1990; Dowden 
& Andrews, 1999a, 1999b; Lowenkamp 
& Latessa, 2004a; Lowenkamp & Latessa, 
2004b). 


The Level of Service Inventory-Revised 


One example of an actuarial classification 
system that measures risk and crimino- 
genic need is the Level of Service Inventory- 
Revised (LSI-R). The LSI-R measures 54 
risk and need factors across 10 criminogenic
domains that are designed to inform
correctional decisions of custody, super- 
vision, and service provision (Andrews & 
Bonta, 1995). The theoretically informed 
predictor domains measured by the LSI-R 
include criminal history, education/employ- 
ment, financial situation, family/marital 
relationships, accommodation, leisure and 
recreation, companions, alcohol or drug use, 
emotional/mental health, and attitudes and 
orientations (Andrews & Bonta). 

The LSI-R assessment is administered 
through a structured interview between the 
assessor and offender, with the recommenda- 
tion that supporting documentation be col- 
lected from family members, employers, case 
files, drug tests, and other relevant sources 
as needed (Andrews & Bonta, 1995). The 
total risk/need score produced by the LSI-R 
is indicative of the number of predictor items 
(out of 54) scored as currently present for the 
offender. The LSI-R score is then actuarially 
associated with a likelihood of recidivism 
that was derived from the observed recidi- 
vism rates of previously assessed offenders. 
Last, domain scores of the LSI-R are used to 
identify an offender’s most promising treat- 
ment targets (Andrews & Bonta). 
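
The scoring logic described above (a count of items scored as present, then a lookup against rates observed for previously assessed offenders) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the score bands and recidivism rates in the lookup table are invented placeholders, not published LSI-R norms.

```python
from typing import Sequence

N_ITEMS = 54  # the LSI-R contains 54 items, per the text

# HYPOTHETICAL lookup: (lowest score, highest score, observed recidivism rate).
# These bands and rates are invented for illustration only. Real values come
# from the instrument's norms or from a local validation sample.
HYPOTHETICAL_BANDS = [
    (0, 13, 0.10),
    (14, 23, 0.30),
    (24, 33, 0.50),
    (34, 54, 0.70),
]


def total_score(items: Sequence[int]) -> int:
    """Count the items scored as currently present (1) rather than absent (0)."""
    if len(items) != N_ITEMS or any(v not in (0, 1) for v in items):
        raise ValueError("expected 54 items, each coded 0 or 1")
    return sum(items)


def associated_rate(score: int) -> float:
    """Return the rate observed for earlier offenders in the same score band."""
    for low, high, rate in HYPOTHETICAL_BANDS:
        if low <= score <= high:
            return rate
    raise ValueError("score out of range")


# Toy offender with 14 of the 54 items scored as present.
example_items = [1] * 14 + [0] * 40
example_score = total_score(example_items)
example_rate = associated_rate(example_score)
```

The point of the sketch is that the total score is a simple count, and its link to recidivism is empirical: the rate attached to a band is whatever was observed for earlier offenders with similar scores.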

Because the LSI-R represents a theoretically
informed, empirically supported,
actuarial-based, and standardized measure 
of criminogenic risk and need, it boasts 
considerable potential to improve caseload
decisions, resource expenditure, and overall
service quality (Andrews & Bonta, 2003; 
Gendreau et al., 1996). 


The LSI-R and Validity 


The benefits promised by any classifica- 
tion system must be empirically evaluated 
against the benefits actually observed in 
prior research evaluations. Research find- 
ings have generated a significant body of 
evidence that established the LSI-R as a valid 
predictor of correctional outcome across 
a variety of measures. Specifically, find- 
ings have supported the predictive valid- 
ity of the LSI-R for institutional infractions 
(Bonta, 1989), probation failure (Andrews, 
Kiessling, Robinson, & Mickus, 1986), half- 
way house failure (Bonta & Motiuk, 1985; 
Motiuk, Bonta & Andrews, 1986), and parole 
violations (Bonta & Motiuk, 1990). Cur- 
rent validity research on the LSI-R has also 
supported the tool as a promising predic- 
tor of future offending (Andrews & Bonta, 
1995; Goggin, Gendreau, & Gray, 1998). 
Moreover, empirical analyses reveal that the 
instrument’s accuracy in predicting future 
offending holds across correctional set- 
tings and offender populations (Holsinger, 
Lowenkamp, & Latessa, 2004; Lowenkamp, 
Holsinger, & Latessa, 2001). 

There are, however, warranted concerns 
about the population-specific nature of pre- 
diction tools. For instance, Wright, Clear, & 
Dickson (1986) tested the predictive valid- 
ity of the Wisconsin model risk assessment 
on samples of probationers in New York 
and Ohio. Though the Wisconsin model 
had demonstrated predictive validity for the 
sample upon which it was created (Baird, 
Hines, & Bemus, 1979), it failed to demon- 
strate predictive validity for either the New 
York or Ohio sample of probationers (Wright 
et al.). Specific to the LSI-R, research con- 
ducted by Dowdy, Lacy, & Unnithan (2001) 
failed to support the tool as a predictor of 
halfway house outcome, two-year recidivism 
for any crime, or two-year felony recidivism 
for a sample of halfway house offenders. 
Taken together, these two findings serve as a 
reminder to correctional agencies that clas- 
sification systems must be validated to their 
specific offender populations. 

The literature provided on the validity of 
the LSI-R has established the tool as a valid 
predictor of correctional outcome across 
offender types and settings. The information 
obtained from the LSI-R can increase the 
accuracy of important corrections-related 
decisions (i.e., classification, risk level, criminogenic
needs, service provision, intensity
of interventions, and program effectiveness).
At the same time, assessment research has
also indicated a lack of universal applicability
for prediction instruments. Federal
probationers represent a unique correctional 
population in that they are older, more 
likely to be Hispanic, and more likely to be 
drug offenders than their state-supervised 
counterparts (Glaze & Palla, 2005; United 
States Probation and Pretrial Services, 2005). 
Because of this, and because of a current lack 
of existing research on the validity of the 
LSI-R for federal probationers, this research 
investigated the predictive accuracy of the 
LSI-R for a sample of federal probationers. 


Method 


Participants 


The sample in this study was comprised 
of 2,107 adult federal probationers. To be 
eligible for inclusion, a federal probationer 
had to have been assessed with the LSI-R by 
a federal probation staff member trained in 
the administration and scoring of the tool. 
LSI-R assessment scores for the sample were 
completed over a two-year period between 
December of 2001 and December of 2003. 


Procedures 


In 2001, the southwestern federal probation 
district that provided these data received a 
three-day training on the implementation 
and scoring of the LSI-R. Six months later, 
follow-up LSI-R training was provided for 
all staff and immediately followed by a “train
the trainers” session for staff who had demonstrated
exceptional LSI-R scoring skills.
During the follow-up and “train the trainers”
sessions, administrative staff voiced concern 
about the tool’s applicability to federal pro- 
bationers and expressed interest in norming 
and validating the LSI-R on their offender 
population. This early discourse between 
federal probation staff and research consul- 
tants about the LSI-R’s psychometric prop- 
erties served as the impetus for what later 
matured into a collaborative effort between 
both parties to provide the agency with 
aggregated probationer needs reports, nor- 
mative information for their offender popu- 
lation, and evidence attesting to the LSI-R’s 
validity for federal probationers. 

Save for outcome, the variables of interest 
in this study were entered into an electronic 
database maintained by federal probation
staff. Once the number of offenders in the
database exceeded 2,000, federal probation 
staff sent a copy to the authors (via electronic 
mail). Upon receipt, the data were cleaned 
and then used to generate a data collec- 
tion sheet that individually listed all sample 
participants by their name, age, sex, ethnic- 
ity, race, and county of committing offense. 
These data collection sheets were then used 
to collect outcome data for the sample. 


Measures 


Although the LSI-R is comprised of ten 
risk and criminogenic need areas, only the 
composite LSI-R score was used in the cur- 
rent research. The LSI-R scores used are the 
result of offender interviews and collateral 
reviews of file and other offender informa- 
tion as completed by a federal probation staff 
member. Recidivism data were collected by 
completing follow-up record checks for each 
offender in the Federal Bureau of Prisons’ 
inmate locator database. The measure of 
recidivism used in the current study was 
incarceration in the Federal Bureau of Pris- 
ons for either a technical violation or new 
offense that occurred subsequent to the ini- 
tial LSI-R assessment. Recidivism was coded 
dichotomously, where a value of 1 indicated 
the occurrence of subsequent incarceration 
and a value of 0 indicated that subsequent 
incarceration had not occurred. 

Several demographic variables were also 
used in the current analyses to both describe 
the sample of offenders studied and further 
specify the relationship between the LSI-R 
and incarceration through their consideration 
in a multivariate analysis. The additional 
variables included in the multivariate analysis 
were age, sex, and ethnicity. While age was a 
continuous variable, sex and ethnicity were 
both coded dichotomously, so that a value of
0 represented the most typical case (male and
Hispanic, respectively) and a value of 1 repre- 
sented a departure from the most typical case 
(female and non-Hispanic, respectively). 
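
The coding scheme described above can be written out as a small sketch. The class and field names here are hypothetical and chosen only for illustration; the 0/1 assignments follow the text (male and Hispanic coded 0, female and non-Hispanic coded 1, subsequent incarceration coded 1).

```python
from dataclasses import dataclass


@dataclass
class Probationer:
    # Hypothetical record layout; the study's actual database fields are not given.
    age: float
    sex: str            # "male" or "female"
    ethnicity: str      # "Hispanic" or "non-Hispanic"
    lsi_r_score: int    # composite LSI-R score
    reincarcerated: bool


def encode(p: Probationer) -> dict:
    """Turn one record into the numeric variables used in the analyses."""
    return {
        "age": p.age,                                        # continuous
        "sex": 0 if p.sex == "male" else 1,                  # 0 = most typical case
        "ethnicity": 0 if p.ethnicity == "Hispanic" else 1,  # 0 = most typical case
        "lsi_r": p.lsi_r_score,
        "recidivism": 1 if p.reincarcerated else 0,          # 1 = subsequent incarceration
    }


encoded_example = encode(
    Probationer(age=29, sex="female", ethnicity="non-Hispanic",
                lsi_r_score=21, reincarcerated=True)
)
```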

It should be noted that the Federal Bureau 
of Prisons’ database used to collect recidi- 
vism data only allowed for the determina- 
tion of incarceration in a federal institution 
subsequent to the initial LSI-R assessment. 
Alternate measures of outcome, such as 
commission of a technical violation, time 
until new commitment, or incarceration 
under state jurisdiction were unavailable. 
Though the use of additional outcome mea- 
sures would admittedly yield more informa- 
tion about the length of time until and type 
of recidivism, the use of subsequent incar- 
ceration is advantageous to the extent that 
it provides a more conservative test of the 
LSI-R’s predictive validity. 


Results 


Descriptives 


Descriptive statistics for offender demo- 
graphics, LSI-R scores and incarceration 
are presented in Table 1. An examination 
of these data reveals that the typical federal 
probationer in this sample was a Hispanic 
male, 37 years of age, classified as low/moderate
risk on the LSI-R (M = 14.08, SD = 7.81).
The descriptive results also revealed a 26.1 
percent base rate of incarceration for the 
sample, indicating that nearly three out of 
four federal probationers had not recidivated 
prior to the completion of follow-up record 
checks for the sample. 


TABLE 1:
Descriptives for the Sample (N = 2,107)

Variable                  M        SD
Age                   37.27     11.47
LSI-R Score           14.08      7.81

Variable                  N         %
Sex
  Male                1,551      73.6
  Female                556      26.4
Ethnicity
  Hispanic            1,369      65.0
  non-Hispanic          738      35.0
Recidivism
  No Reincarceration  1,557      73.9
  Reincarceration       550      26.1


Validation


The predictive validity of the LSI-R for federal
probationers was examined by conducting
three separate analyses. The first of these
involved calculating a predictive validity
estimate of the relationship between the LSI-R
and incarceration for the sample of 2,107
federal probationers. This analysis supported
the LSI-R as a significant predictor of
subsequent incarceration (r = .283, p < .01).
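
A correlation between a continuous score and a 0/1 outcome is a point-biserial correlation, which can be computed directly from the two group means. The sketch below uses invented toy scores, not data from this study, and the function name is hypothetical. It is included to show how the proportion of recidivists in the sample enters the formula.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(scores, outcomes):
    """Correlation between a numeric score and a 0/1 outcome.

    Algebraically identical to the Pearson r computed on the same data
    when the population standard deviation of the scores is used.
    """
    ones = [s for s, y in zip(scores, outcomes) if y == 1]
    zeros = [s for s, y in zip(scores, outcomes) if y == 0]
    p = len(ones) / len(scores)  # base rate: share of cases coded 1
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(p * (1 - p))


# Toy data only: three recidivists (1) and four nonrecidivists (0).
toy_scores = [20, 25, 30, 5, 10, 15, 22]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0]
toy_r = point_biserial(toy_scores, toy_outcomes)
```

Because the term `sqrt(p * (1 - p))` appears in the formula, the same separation between group means produces a different r when the base rate changes.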

The second test in this research investigated
the relationship between the LSI-R and incarceration
using receiver operating characteristic
(ROC) analysis. The correlation coefficient 
calculated in the first analysis of this research 
represents the validity estimate most com- 
monly reported in existing LSI-R research 
endeavors. However, because the magnitude 
of a correlation coefficient is dependent on 
the percentage of the sample identified as 
a recidivist by the risk tool (selection ratio) 
and percentage of actual recidivists in that 
sample (base rate), it can be said that exist- 
ing research has yet to yield a statistic that 
would permit an unbiased comparison of the 
LSI-R’s predictive strength across samples 
and as compared to other prediction tools. 
To address this deficiency, ROC analysis 
was chosen as a second means of examining 
the LSI-R’s predictive validity in this study. 
The ROC method produces a statistic that, 
because it is unaffected by sample-specific 
base rates and selection ratios, is remarkably 
useful in comparing the utility and strength 
of a prediction instrument across different 
samples and vis-à-vis other prediction tools
(Mossman, 1994a). 

Statistics derived from ROC curves summarize
the trade-off between true positives and false
positives in a given sample across all possible
cutoff scores (i.e., as identified from predictions
made by the risk instrument and observations
for the outcome variable). The ROC analysis completed
in this research produced an area equal to 
.689 (p < .01) to describe the relationship 
between the LSI-R and subsequent incar- 
ceration. Simply put, there was a 68.9 percent 
chance that a randomly selected recidivist 
had a higher score on the LSI-R than did 
a randomly selected nonrecidivist (Rice & 
Harris, 1995). 
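
The interpretation of the ROC area as a pairwise probability, and its independence from the base rate, can be illustrated with a small sketch. The scores are invented toy data rather than values from this study, and the function name `auc` is hypothetical. Ties are counted as half a win, which is a common convention and an assumption here.

```python
def auc(recidivist_scores, nonrecidivist_scores):
    """Share of (recidivist, nonrecidivist) pairs in which the recidivist
    has the higher score; tied pairs count as half."""
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))


# Toy data only.
recidivists = [20, 25, 30]
nonrecidivists = [5, 10, 15, 22]

# Base rate of 3 in 7.
auc_original = auc(recidivists, nonrecidivists)

# Same score distributions, but three times as many nonrecidivists,
# so the base rate falls to 3 in 15.
auc_lower_base_rate = auc(recidivists, nonrecidivists * 3)
```

In this toy example 11 of the 12 pairs favor the recidivist, so the area is 11/12, and it is unchanged when the nonrecidivist group is tripled. A correlation coefficient computed on the same two samples would not stay the same.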

The analyses performed thus far have 
revealed that the LSI-R was a valid and 
robust predictor of subsequent incarceration
for federal probationers. To further
specify this relationship, the final analysis in 
this study estimated a multivariate logistic 
regression model that examined the rela- 
tionship between the LSI-R and incarcera- 
tion while simultaneously considering the 
effects of age, sex, and ethnicity. The results 
of this multivariate analysis are reported
in Table 2. An examination of the logistic
regression model reveals that the LSI-R
continued to be a significant predictor of 
incarceration, even when offender age, sex, 
and ethnicity were controlled. Moreover, a 
review of the values for Exponent(B) shows 
that the LSI-R was the strongest predictor of 
incarceration among the variables included 
in the multivariate model. This conclusion 
is also reached through an examination 
of the R values estimated for each of the 
predictor variables. The R value of .250 for 
the LSI-R is more than twice the R values 
estimated for the other significant predictors 
of incarceration included in the model (age 
and sex). Finally, the nature of the relation- 
ship observed between incarceration and the 
control variables of age and sex is also worth 
noting. The logistic regression analyses 
revealed a significant and negative R value 
for each variable, indicating female federal 
probationers and older federal probationers 
were less likely to recidivate than were their 
male and younger counterparts. 
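
For readers less familiar with logistic regression output, the relationship between a coefficient B, its Exp(B), and the confidence interval for Exp(B) can be sketched as follows. The B and SE values are taken from Table 2 as printed; the function names are hypothetical, and the 1.96 multiplier assumes a conventional 95 percent interval.

```python
from math import exp


def odds_ratio(b, units=1.0):
    """Multiplicative change in the odds of incarceration for a change of
    `units` in the predictor, holding the other predictors constant."""
    return exp(b * units)


def odds_ratio_ci(b, se, z=1.96):
    """Approximate 95% confidence interval for Exp(B), from B and its SE."""
    return exp(b - z * se), exp(b + z * se)


# LSI-R coefficient from Table 2 (B = 0.085).
lsi_per_point = odds_ratio(0.085)                 # about 1.09
lsi_per_ten_points = odds_ratio(0.085, units=10)  # about 2.3

# Sex coefficient from Table 2 (B = -0.815, SE = .134).
sex_lower, sex_upper = odds_ratio_ci(-0.815, 0.134)
```

Read this way, each additional LSI-R point multiplies the odds of subsequent incarceration by roughly 1.09, so a ten-point difference in score corresponds to odds a little more than twice as high, other variables held constant.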


Second, this study used ROC methods to 
calculate an unbiased measure of predictive 
accuracy appropriate for comparisons across 
samples and between different tools, a mea- 
sure absent in existing research, but critical 
for knowledge advancement. 

TABLE 2:
Logistic Regression Model Predicting Incarceration for the Sample (N = 2,107)

                                                                        95% CI for Exp(B)
Variable        B         SE         Wald      DF   Sig     R     Exp(B)   Lower     Upper
Age          -0.030      .005      34.8265      1   .000  -.117    .970     .961      .980
Sex          -0.815      .134      37.0412      1   .000  -.121    .443     .341   [illegible]
Ethnicity    -0.036   [illegible]    .1040      1   .747   .000    .964     .774     1.202
LSI-R         0.085   [illegible]  152.2118     1   .000   .250   1.088    1.074     1.103
Constant     -1.034      .211      23.9882      1   .000

Note: -2 log likelihood = 2175.887; χ²(4) = 236.335; p < .001; Pseudo R² = .156.


Discussion


This research sought to make two important
contributions to the knowledge base of recidivism
prediction. First, because the use of valid and
efficient risk/needs assessment tools implies a
certain level of improvement in the ability to
manage offender caseloads and classify correctional
populations, their use is becoming increasingly
prevalent throughout North America. As the
popularity of such prediction tools increases, so
too does the diversity of the offender population
to which the tools will be applied. Consequently,
it was deemed important to examine the LSI-R’s
efficacy in predicting long-term outcome for a
sample of federal probationers, a correctional
population previously overlooked in existing LSI-R
validation studies.

Results from the predictive validity analyses
were encouraging and provided evidence that the
LSI-R was a valid and robust predictor of
subsequent incarceration for this sample of federal
probationers. Additionally, the multivariate
analysis conducted in this research found that the
LSI-R remained a valid predictor of subsequent
incarceration when the effects of age, sex, and
ethnicity were controlled. Taken together, these
results make a strong case for the generalizability
of the LSI-R to diverse offender populations.
The findings of these, as well as previous,
analyses further demonstrated that a theoretically
informed and empirically refined actuarial measure
of client risk and criminogenic need has much to
offer correctional practice. When these findings
are considered in the context of existing research,
use of the LSI-R to inform correctional decisions
of supervision and service provision appears
to epitomize evidence-based practices.
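
As a rough illustration of how the odds ratios in Table 2 relate to the logistic regression coefficients, the sketch below computes Exp(B) and a 95% confidence interval as exp(B ± 1.96 × SE). It is an illustrative check, not the authors' procedure. The Age coefficient is taken as -0.030, the sign implied by its reported odds ratio of .970, and the function name is hypothetical.

```python
import math

def odds_ratio_with_ci(b, se, z=1.96):
    """Return (Exp(B), lower, upper) for a logistic regression coefficient.

    b  : coefficient on the log-odds scale
    se : standard error of the coefficient
    z  : normal quantile for the interval (1.96 gives roughly 95%)
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Coefficients as reported in Table 2 (Age sign assumed negative, see lead-in).
age_or, age_lo, age_hi = odds_ratio_with_ci(-0.030, 0.005)
sex_or, sex_lo, sex_hi = odds_ratio_with_ci(-0.815, 0.134)

print(f"Age: Exp(B) = {age_or:.3f}, 95% CI {age_lo:.3f} to {age_hi:.3f}")
print(f"Sex: Exp(B) = {sex_or:.3f}, lower bound {sex_lo:.3f}")
```

Small differences from the published values are to be expected because the table reports B and SE rounded to three decimals.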

In addition to further establishing the 
predictive validity of the LSI-R, these anal- 
yses contributed to existing research by 
using ROC methods to calculate an index 
of predictive accuracy that is independent 
of sample base rates and selection ratios. 
ROC methods are important in prediction 
research to facilitate comparisons of pre- 
dictive strength across samples and, more 
importantly, across prediction instruments. 
The ROC area of .689 generated in this 
research was moderate to large in magni- 
tude (Rice & Harris, 1995), and indicated 
that a randomly selected recidivist would 
have a higher LSI-R score than would a randomly
selected nonrecidivist 68.9 percent
of the time. Beyond the importance of this 
finding for the accuracy of the LSI-R, it is 
hoped that the ROC analyses reported here 
prompt future prediction efforts (on all risk/ 
need assessment tools) to consider the inher- 
ent link between comparisons of predictive 
accuracy and knowledge advancement. 
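
The interpretation of the ROC area given above (the probability that a randomly selected recidivist scores higher than a randomly selected nonrecidivist) can be sketched directly by comparing every recidivist with every nonrecidivist. The scores below are invented purely for illustration, and the function name is hypothetical; ties are counted as half a win, a common convention.

```python
def roc_area(recidivist_scores, nonrecidivist_scores):
    """ROC area as the share of (recidivist, nonrecidivist) pairs in which
    the recidivist has the higher score; ties count as one half."""
    wins = 0.0
    for r in recidivist_scores:
        for n in nonrecidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(nonrecidivist_scores))

# Hypothetical risk scores, not data from the study.
recidivists = [30, 25, 18]
nonrecidivists = [20, 15, 25, 10]
auc = roc_area(recidivists, nonrecidivists)
print(f"ROC area = {auc:.3f}")
```

Because the calculation depends only on the ordering of scores within pairs, it does not change with the proportion of recidivists in the sample, which is why the index is independent of base rates.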
Optimistic conclusions aside, there were 
several limitations present in this research. 
First, the outcome measure used did not 
permit analyses with respect to length of 
time until failure, failure due to technical 
violation, type of crime committed at failure, 
or failure under state jurisdiction. Addition- 
ally, these analyses examined the predictive 
accuracy of the LSI-R for a large sample of 
federal probationers, but did not investigate 
the tool’s applicability to offender subgroups. 
Certainly, existing research can benefit from 
future efforts that might further specify the 
relationship between the LSI-R and outcome 
across offender sex, ethnicity, and race. 


References 


Andrews, D., & Bonta, J. (1995). LSI-R: The Level 
of Service Inventory-Revised. Toronto, ONT: 
Multi-Health Systems, Inc. 

Andrews, D., & Bonta, J. (2003). The psychology 
of criminal conduct. Cincinnati: Anderson. 

Andrews, D. A., Kiessling, J. J., Robinson, D., &
Mickus, S. G. (1986). The risk principle of 
case classification: An outcome evaluation 
with young adult probationers. Canadian 
Journal of Criminology, 28(4), 377-384. 

Andrews, D., Zinger, I., Hoge, R., Bonta, J., 
Gendreau, P., & Cullen, F. (1990). Does 
correctional treatment work? A clinical- 
ly relevant and psychologically informed 
meta-analysis. Criminology, 28(3), 369-404. 

Baird, C., Hines, C., & Bemus, B. (1979).
The Wisconsin Case Classification/Staff 
Deployment Project: Two-Year Follow-Up 
Report. Madison, WI: Wisconsin Division 
of Corrections. 

Bonta, J. L. (1989). Native inmates: Institutional 
response, risk, and needs. Canadian Journal 
of Criminology, 39, 49-62. 

Bonta, J., & Motiuk, L. L. (1985). Utilization
of an interview-based classification instrument:
A study of correctional halfway
houses. Criminal Justice and Behavior,
12(3), 333-352.


Bonta, J., & Motiuk, L. L. (1990). Classification 
to halfway houses: A quasi-experimental 
evaluation. Criminology, 28, 497-506. 

Brown, S. L. (2002). The dynamic prediction of 
criminal recidivism: A three wave prospec- 
tive study. Unpublished dissertation, Queen’s 
University, Kingston, Ontario, Canada. 

Clements, C. B. (1996). Offender classification: 
Two decades of progress. Criminal Justice 
and Behavior, 23(1), 121-143.

Cullen, F. T., Wright, J. P., & Applegate, B. K. 
(1996). Control in the community: The limits 
of reform? In A. T. Harland (Ed.), Choosing 
correctional options that work: Defining the 
demand and evaluating the supply (pp. 69- 
116). Newbury Park, CA: Sage. 

Dowden, C., & Andrews, D. A. (1999a). What
works for female offenders: A meta-ana- 
lytic review. Crime and Delinquency, 45(4), 
438-452. 

Dowden, C., & Andrews, D. A. (1999b). What
works in young offender treatment: A meta- 
analysis. Forum on Corrections Research, 
11, 21-24. 

Dowdy, E. R., Lacy, M. G., & Unnithan, N. 
P. (2001). Correctional prediction and the 
Level of Service Inventory. Journal of Crimi- 
nal Justice, 30, 29-39. 

Gendreau, P., Little, T., & Goggin, C. (1996). 
A meta-analysis of the predictors of adult 
offender recidivism: What works? Crimi- 
nology, 34, 575-607. 

Glaze, L. E., & Palla, S. (2005). Probation and 
parole in the United States, 2004 (NCJ 
210676). Bureau of Justice Statistics Bul- 
letin. Washington, DC: U.S. Department 
of Justice. 

Goggin, C., Gendreau, P., & Gray, G. (1998). 
Associates and social interaction. Forum on 
Corrections Research, 10(3), 24-27. 

Gottfredson, S., & Gottfredson, D. (1986). The 
accuracy of prediction models. In A. Blum- 
stein, J. Cohen, & C. Visher (Eds.), Criminal 
careers and “career criminals” (pp. 212-290). 
Albany: SUNY. 

Grove, W. M., Zald, D. H., Lebow, B. S., Snitz, 
B. E., & Nelson, C. (2000). Clinical versus 
mechanical prediction: A meta-analysis. 
Psychological Assessment, 12(1), 19-30. 

Holsinger, A. M., Lowenkamp, C. T., & Latessa, 
E. J. (2004). Validating the LSI-R on a 
sample of jail inmates. Journal of Offender 
Monitoring, Winter/Spring, 8-9. 

Jones, D. A., Johnson, S., Latessa, E. J., & Travis,
L. F. (1999). Case classification in community
corrections: Preliminary findings
from a national survey. Topics in Community
Corrections, National Institute of
Corrections, U.S. Department of Justice,
Washington, D.C.


Latessa, E. (2003-2004). Best practices 
of classification and assessment. Jour- 
nal of Community Corrections. Winter 
2003-2004. 

Lowenkamp, C. T., Holsinger, A. M., & Latessa, 
E. J. (2001). Risk/need assessment, offender 
classification, and the role of child abuse. 
Criminal Justice and Behavior, 28(5), 
543-563. 

Lowenkamp, C. T., & Latessa, E. J. (2004a).
Increasing the effectiveness of correctional 
programming through the risk principle: 
Identifying offenders for residential place- 
ment. Criminology and Public Policy, 4(1), 
501-528. 

Lowenkamp, C. T., & Latessa, E. J. (2004b). Residential
community corrections and the
risk principle: Lessons learned in Ohio. 
Ohio Corrections Research Compendium, 
2, 245-254. 

Lowenkamp, C. T., & Latessa, E. J. (2005). The
role of offender assessment tools and how
to select them. For the Record, 4th quarter,
18-20.

Meehl, P. E. (1954). Clinical versus statistical 
prediction. Minneapolis, MN: University of 
Minnesota Press. 

Monahan, J. (1981). Predicting violent behavior. 
Beverly Hills, CA: Sage. 

Mossman, D. (1994a). Assessing predictions of 
violence: Being accurate about accuracy. 
Journal of Consulting and Clinical Psychol- 
ogy, 62, 783-792. 

Motiuk, L. L., Bonta, J., & Andrews, D. A. (1986).
Classification in correctional halfway hous- 
es: The relative and incremental predictive 
criterion validities of the Megargee-MMPI 
and LSI systems. Criminal Justice and 
Behavior, 13-46. 

Rice, M. E., & Harris, G. T. (1995). Violent 
recidivism: Assessing predictive accuracy. 
Journal of Consulting and Clinical Psychol- 
ogy, 63(5), 737-748. 

Taylor, G. (2001). Importance of developing 
correctional plans for offenders. Forum on 
Corrections Research, 13, 14-17. 

United States Probation and Pretrial Services 
(2005). Year-in-review report: Fiscal year 
2004. Washington D.C.: Administrative 
Office of the United States Courts. 

Wright, K., Clear, T., & Dickson, P. (1984). 
Universal application of probation risk 
assessment instruments: A critique. Crimi- 
nology, 22(1), 113-134. 




RISK AND NEEDS assessment has been 
central to correctional operations for 
decades. Assessment not only helps predict
offenders’ future behavior; it can also help
organizations allocate staff workload and 
resources. Before the late 1970s, judgments 
about offender risk were often subjective, 
based on experience or the intuition of cor- 
rectional practitioners (Solomon & Camp, 
1993). Objective systems began to appear in 
the 1970s and offered the promise of more 
efficient and systematic means of classifica- 
tion for offender risk and management than 
clinical intuition alone. The National Institute
of Corrections’ model Risk Classification
initiative, undertaken in the early 1980s,
introduced many jurisdictions to objective 
case classification (Jones, Johnson, Latessa, 
& Travis, 1999). Today, risk and classifica- 
tion tools are used in a myriad of criminal 
justice decisions—from pretrial release to 
parole supervision for both juvenile and adult 
populations. More recent “third generation” 
instruments include criminogenic needs of 
the offender that should be addressed in 
order to reduce recidivism (Bonta, 1996). 
One of the most critical issues for assess- 
ment instruments is their predictive validity. 
An instrument should be able to accurately 
predict which offenders will and will not 
recidivate. Whether an instrument is selected 
from a number of commercially available 
products (such as the Level of Service Inven- 
tory and Correctional Offender Management 
Profiling for Alternative Sanctions) or devel- 
oped by a jurisdiction, it should be validated 
on the local population. The current article
discusses the validation of the San Diego
Risk and Resiliency Checkup on a sample of
juvenile offenders in Los Angeles County.


Background 


Although the Los Angeles County Proba- 
tion Department routinely gathered back- 
ground information on youths entering its 
juvenile system, no validated risk assessment 
was being used through the early 2000s. As 
part of a court settlement regarding services 
provided to minority youth in the county, 
the department was required to allocate 
resources for the administration of a validat- 
ed risk and needs instrument to its juvenile 
probationers. Of particular importance was 
that the instrument work well for youths of 
all ethnicities. 

Working with a committee representing 
the parties of the court settlement, research- 
ers assisted in identifying and eventually 
validating a risk assessment instrument to be 
used in the county. After surveying instru- 
ments currently in use in the United States, 
we determined that items used in risk and 
needs instruments generally fell into one of 
nine conceptual categories: prior and cur- 
rent offenses/dispositions, family circum- 
stances/parenting, education, employment, 
peer relations, substance abuse, leisure/rec- 
reation, personality/behavior, and attitudes/ 
orientation. However, many of the instru- 
ments that we found in use had not been 
validated on the populations to whom they 
were administered, so that we were unable to 
determine their effectiveness in distinguish- 
ing high-risk youths from low-risk youths. 

We identified three instruments that had 
undergone validation: the Youth Level of 
Service Inventory (YLSI) (Multi-Health Systems
Inc., 1998), the San Diego Risk and
Resiliency Checkup (SDRRC) (Little, n.d.), 
and the Washington Association of Juve- 
nile Court Administrators Risk Assessment 
(WSJCA-RA) (Washington State Institute 
for Public Policy, 2004). Each includes mul- 
tiple items for the conceptual categories we 
identified, and each offered advantages and 
disadvantages when compared to the oth- 
ers. The Department favored the SDRRC, 
primarily because it could be administered 
during the intake process. It also preferred 
the SDRRC’s emphasis on positive (“protec- 
tive”) factors, whereas most risk and needs 
assessment instruments primarily focus on 
risk factors. The remaining settlement par- 
ties agreed, and the SDRRC was selected as 
the instrument to be tested. 


The San Diego Risk and Resiliency 
Checkup 


The SDRRC consists of 60 items in six 
conceptual categories, half of which are 
risk factors and half protective factors. The 
conceptual categories are delinquency, edu- 
cation, family, peer relations, substance use, 
and individual factors. Each conceptual cat- 
egory includes five protective factors and 
five risk factors. Each item is scored as “yes,” 
“no,” or “somewhat.” Scores from the risk 
and protective subscales are combined into 
a single resiliency score. The SDRRC also 
includes additional protective factors and 
additional risk factors that are not included 
in the resiliency score, but which may be 
used to tailor an individual’s supervision. A 
copy of the SDRRC instrument is included in
the Appendix. 
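
To make the scoring structure concrete, the following is a minimal sketch of how a resiliency score could be assembled from the 60 items. The point values (no = 0, somewhat = 1, yes = 2) and the convention that risk items subtract from the total are assumptions made for illustration; the actual scoring rules are those on the instrument itself. All names in the code are hypothetical.

```python
# Six conceptual categories, each with five protective and five risk items.
CATEGORIES = ["delinquency", "education", "family",
              "peer relations", "substance use", "individual"]

# Assumed point values for each response; not taken from the instrument.
POINTS = {"no": 0, "somewhat": 1, "yes": 2}

def resiliency_score(protective, risk):
    """Combine protective and risk responses into one net score.

    protective, risk: dicts mapping category name to a list of five
    responses ("yes", "no", or "somewhat").
    Protective items add points; risk items subtract points (assumption).
    """
    protective_total = sum(POINTS[a] for c in CATEGORIES for a in protective[c])
    risk_total = -sum(POINTS[a] for c in CATEGORIES for a in risk[c])
    return protective_total + risk_total

# A hypothetical youth: every protective factor present, no risk factor present.
best_case = resiliency_score({c: ["yes"] * 5 for c in CATEGORIES},
                             {c: ["no"] * 5 for c in CATEGORIES})
print(best_case)
```

Under these assumed point values the score would range from -60 to 60, with higher values indicating more resiliency.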

One important difference between the 
SDRRC and most other risk and needs instru- 
ments is that a higher score on the SDRRC 
implies higher resiliency, i.e., a lower score 
corresponds to a higher risk of re-offending. 
Most risk and needs instruments, by con- 
trast, associate high scores with high risk of 
recidivism. The SDRRC does not contain any 
preset cut-points for youth risk levels. 

The one existing validation study of the 
SDRRC was performed by Little (n.d.). This 
study included 2,633 youths surveyed in San 
Diego between February 5, 1999, and March 
28, 2001. The SDRRC was found to be effec- 
tive in predicting future offenses (Little, 
n.d.). The total resiliency profile appeared 
superior to either of the total risk and total 
protective scales. The correlation between 
the total resiliency profile and occurrence 
of a subsequent offense was -0.146 (p<.001). 
Using a logistic regression model to predict 
follow-up offenses, Little also found age, 
gender, ethnicity, and prior criminal history, 
as well as resiliency score, to be significant 
predictors of re-offending (Little, n.d.). 


Methods 


Selecting the Sample 


We wanted to ensure adequate statistical
power for detecting differences in recidivism 
rates between low-, moderate-, and high-risk 
youths, as well as differences between groups 
defined by race/ethnicity and gender. Because 
the SDRRC does not have any preset risk cut- 
points, the pilot study proposed to divide 
the sample into approximate thirds defining 
low-, moderate-, and high-risk groups. The 
probability of detecting a difference in recidi- 
vism rates between the three risk groups 
depends upon the number of groups (in this
case, three), the sample size of each group,
and the spread of the true rates of recidivism.
Because we did not know the true rates of 
recidivism for the different risk groups, we 
proposed three plausible “true” scenarios for 
the probability of rearrest at 6 months for 
low-, moderate- and high-risk youths. 
The three were:

               Low    Moderate    High
Scenario 1:    11%      21%       27%
Scenario 2:    12%      25%       32%
Scenario 3:    12%      18%       32%


With these three scenarios,¹ we determined
that at least 120 to 140 youths in each
risk level would need to be included in order 
to be able to detect differences. However, 
we also wanted to be able to detect differ- 
ences for key subgroups: boys as well as girls; 
and for blacks, Hispanics, and white/other 
youths. Each of the subgroups of inter- 
est needed between 100 and 120 youths 
within low, moderate, and high-risk groups 
for adequate power. Therefore, we needed 
approximately 300 to 400 of each gender and 
each race/ethnic group. 
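
One way to explore how group size affects the chance of detecting such differences is sketched below. It approximates the power of a chi-square test of independence for three equally sized groups, using the noncentral chi-square distribution with 2 degrees of freedom. This is an illustrative calculation under stated assumptions (equal group sizes, a single overall test, alpha of .05) and is not necessarily the method used to arrive at the 120 to 140 figure, which may rest on a different test or power criterion. Function names are hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a central chi-square with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def power_three_groups(rates, n_per_group, alpha=0.05, terms=200):
    """Approximate power of the 3 x 2 chi-square test with equal group sizes."""
    if len(rates) != 3:
        raise ValueError("this sketch handles exactly three groups")
    p_bar = sum(rates) / 3.0
    # Noncentrality parameter for equal-sized groups.
    lam = n_per_group * sum((p - p_bar) ** 2 for p in rates) / (p_bar * (1 - p_bar))
    crit = -2.0 * math.log(alpha)  # exact critical value when df = 2
    # Noncentral chi-square tail as a Poisson mixture of central chi-squares.
    power, weight = 0.0, math.exp(-lam / 2.0)
    for j in range(terms):
        power += weight * chi2_upper_tail_even_df(crit, 2 + 2 * j)
        weight *= (lam / 2.0) / (j + 1)
    return power

scenario_1 = [0.11, 0.21, 0.27]
for n in (60, 100, 130):
    print(n, round(power_three_groups(scenario_1, n), 3))
```

Power rises with group size and with the spread among the three rates, which is the dependence described in the text.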

Our final sample size target was 1,200
youths for the study. This included 800 males
and 400 females, and 400 each of whites,
blacks, and Hispanics. Because probation officers
assess youths in both court and non-court
venues, we designated approximately 800
court cases and 400 non-court cases.²

Table 1 shows the full stratified target sam- 
ple, with the size of the sample in each cell. 

Four area offices were chosen for the
assessment in order to provide geographical
representation of the county.³ These were Long
Beach/Harbor (South), Pomona (East),
Centinela (West), and Van Nuys (North). Each
area office was to supply one-quarter of the
target sample assessments.


TABLE 1:
Sampling Design for Validation Study

                     Male                  Female
Ethnicity       Court   Non-court     Court   Non-court    Total
Black            178       89           89       44         400
Hispanic         178       89           89       44         400
White/other      178       89           89       44         400
Total            534      267          267      132        1200


Training

Probation officers volunteered for the assessment
pilot. Originally, 18 Deputy Probation
Officers (DPOs), 14 field and 4 Camp Community
Transition Program (CCTP), were trained
in the administration of the SDRRC. The
three-day training consisted of an overview
of the instrument; its application and practice;
an overview of evidence-based practice,
including the six criminogenic
needs and the eight guiding principles
for risk/recidivism reduction; motivational
interviewing techniques; and the actual
administration of the tool. Training was
conducted by staff from Justice System
Assessment & Training, the firm that developed
the SDRRC. DPOs were provided with an incentive
of 30 minutes of overtime payment for each
assessment completed during the pilot.


Data Collection


Data were collected in three general areas:
assessment scores, services received, and
recidivism.


¹ These estimates are based on unpublished
analyses from Turner & Fain (2003).


² Police refer cases to the District Attorney in
Los Angeles for processing. Youths charged with 
offenses for which the District Attorney must file 
and those youths who are detained in juvenile 
hall are directed to the Court for arraignment. 
The SDRRC was administered at this pre-plea 
stage for these “court” cases. 

Youths not initially referred to court—those 
generally with more minor offenses—are referred 
to Probation to make a determination of how 
to handle the case. These “non-court” cases can
receive a number of possible outcomes, including
having the case closed, the youth being placed
on informal probation, or the case being referred 
to court. The SDRRC was administered to “non- 
court” cases at this point. 


³ Due to logistical restrictions, we were not able
to pilot test the SDRRC in all area Probation 
Department offices. 


Assessment Scores. DPOs administered 
the assessments to youths. Information on 
each of the additional risk and protective 
factors that do not contribute to the over- 
all resiliency score was also recorded. The 
assessment form also includes demographic 
variables (age, gender, ethnicity), informa- 
tion about proficiency in English, and crimi- 
nal history. 

Assessments were conducted from Decem- 
ber 6, 2002, through October 30, 2003. A total 
of 1,165 youths were assessed by Los Angeles 
County probation officers. We also gathered 
information on whether the youth’s case pro- 
ceeded to supervision or ended at investiga- 
tion (no further probation supervision). 

Recidivism. Using the Probation Department’s
databases, we obtained information
on arrests during the 12 months after assess- 
ment for each subject. These data include 
both juvenile and adult arrests. Date of each 
arrest, charges, and disposition were record- 
ed. We also used records from juvenile halls 
and juvenile camps to determine how many 
days a given youth was incarcerated during 
the 12 months after assessment. 

We were unable to determine whether a 
given youth was rearrested during the follow-up
period for 129 (11.1 percent) of the
1,165 youths originally assessed for the study.
Our final sample is 1,036 youths. Missing 
data were primarily due to incomplete dis- 
position records, so we were unable to deter- 
mine whether some youths were in custody 
(and therefore incapable of being rearrested). 
We found no significant differences on gen- 
der or age between the deleted cases and 
those in the final sample. Significantly more 
Hispanics, and significantly fewer blacks, 
were in the final sample than among the 
deleted cases (p < .001). 


TABLE 2:
Demographic Characteristics (Unweighted)

                        N (%)
Gender
  Male               768 (74.1%)
  Female             268 (25.9%)
Age
  9-12                65 (6.3%)
  13-14              240 (23.2%)
  15-16              404 (39.0%)
  17-18              322 (31.1%)
  19+                  5 (0.5%)
Ethnicity
  White              194 (18.7%)
  Black              299 (28.9%)
  Hispanic           436 (42.1%)
  Other               97 (9.4%)
  Unknown             10 (1.0%)
Case Type
  Court              782 (75.5%)
  Non-court          254 (24.5%)

  Investigation      294 (28.4%)
  Supervision        742 (71.6%)


Weighting the Final Sample to Reflect 
Population of Probation Youths in Los 
Angeles 


Our final sample did not exactly match the
target sample presented in Table 1 above. In
particular, the final sample included somewhat
more males, more Hispanics, and more
court cases than we had originally targeted.
By weighting our final analysis sample to
represent the entire population of investigation
and supervision cases for Los Angeles
(as described below), we have adjusted for
differences between the targeted sample and
final sample, so that our analyses accurately
represent the gender and ethnic mix
among all Los Angeles cases.

In order to weight the final sample, we 
obtained the frequency of all youth investi- 
gation and supervision cases for Los Angeles 
during the same time period as the pilot 
assessment, with information on youth gender,
race/ethnicity,⁴ and court vs. non-court
case type (see Appendix). Within each com- 
bination of gender, race/ethnicity, and court 
vs. non-court case type, we defined a weight 
to be the ratio of youths in the probation 
population to the number of youths in the 
final sample. This allowed us to weight the 
data to reflect the entire population on these 
characteristics.⁵ All analyses were conducted
on the weighted final sample. 
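
The weighting rule described above (population count divided by sample count within each combination of gender, race/ethnicity, and case type) can be sketched as follows. The cell counts and arrest outcomes are invented for illustration and are not the study's data; function names are hypothetical.

```python
def cell_weights(population_counts, sample_counts):
    """Weight for each cell = youths in the population / youths in the sample."""
    return {cell: population_counts[cell] / sample_counts[cell]
            for cell in sample_counts}

def weighted_arrest_rate(records, weights):
    """records: list of (cell, arrested) pairs, one per sampled youth."""
    total = sum(weights[cell] for cell, _ in records)
    arrested = sum(weights[cell] for cell, was_arrested in records if was_arrested)
    return arrested / total

# Hypothetical cells keyed by (gender, race/ethnicity, case type).
population = {("male", "Hispanic", "court"): 300,
              ("female", "white", "non-court"): 100}
sample = {("male", "Hispanic", "court"): 10,
          ("female", "white", "non-court"): 10}
weights = cell_weights(population, sample)

# Hypothetical outcomes: 1 of 10 arrested in the first cell, 5 of 10 in the second.
records = ([(("male", "Hispanic", "court"), i < 1) for i in range(10)]
           + [(("female", "white", "non-court"), i < 5) for i in range(10)])
print(weighted_arrest_rate(records, weights))
```

In this toy example the unweighted arrest rate is 6 of 20 (30 percent), while the weighted rate is 20 percent, because the cell with the lower rate represents more of the population.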


Results 


Mean Differences in Resiliency by 
Demographic Characteristics 


SDRRC resiliency scores differed by gender, 
age, and ethnicity; some differences were 
large enough to be statistically significant 
(see Table 3). The most pronounced differ- 
ences were for different ethnic groups, with 
pair-wise comparisons of whites, blacks, 
Hispanics, and “other” race all producing 
significant differences. “Other” youths (pri- 
marily Asians) had the highest mean resil- 
iency scores, followed by whites, blacks, 
and Hispanics, respectively. There were also 
marginally significant differences between 
males and females (t = 1.89, p < .06) and 
between youths aged 15 or 16 and those aged 
17 or 18 (t = -1.93, p < .06). 


TABLE 3:
Mean Resiliency Scores by Demographic
Characteristics (Weighted)

                 Resiliency score    Number in sample
Gender
  Male                18.9                 768
  Female           [illegible]             268
Age
  9-12                23.1                  65
  13-14               20.5                 240
  15-16*              17.7                 404
  17-18*              20.9                 322
  19+                 -0.3                   5
Ethnicity
  White               25.7*                194
  Black            [illegible]             299
  Hispanic            16.0*                436
  Other               32.6*                 97
  Unknown             12.1                  10

* p < .05 using t-tests


⁴ For weighting purposes, ethnicity was divided
into five categories: black, Hispanic, white, other
race, and unknown. Age was categorized as less
than 13, greater than 18, and single years of age
for ages 13-18.


⁵ We were not able to use age in calculating
weights because there were too few representatives
in the sample for some combinations of
gender, race/ethnicity, court type, and age.


Characteristics of the SDRRC


The SDRRC comprises two subscales: the
protective subscale and the risk subscale.
Each of these contains subscales for delinquency,
education, family, peer, substance
use, and individual. Table 4 shows the correlations
between the overall resiliency score
and the individual subscales. Table 5 gives
the correlations among the subscales. It is
important to note that the total SDRRC scale
reflects “resiliency.” Resiliency is defined as
the net sum of protective and risk factors.


TABLE 4: 
Correlations Between Total Resiliency 
Score and Subscale Items (Weighted) 


Score Correlation 
Total protective score 0.93 
Total risk score 0.88 
Net risk for delinquency 0.85 
Net risk for education 0.81 
Net risk for family 0.81 
Net risk for peer 0.87 
Net risk for substance use 0.81 
Net risk for individual 0.88 
Delinquency risk factors 0.64 
Education risk factors 0.68 
Family risk factors 0.60 
Peer risk factors 0.70 
Substance use risk factors 0.54 
Individual risk factors 0.73 
Delinquency protective factors 0.75 
Education protective factors 0.78 
Family protective factors 0.81 
Peer protective factors 0.77 
Substance use protective factors 0.77 
Individual protective factors 0.82 


Note: All correlations significantly different 
from zero (p < .05). 
TABLE 5:
Correlations Among Resiliency Subscales (Weighted)

                 Total   Delinquency   Education   Family   Peer   Substance use   Individual
Total            1.00       0.85         0.82       0.82    0.88       0.82           0.88
Delinquency                 1.00         0.64       0.64    0.69       0.65           0.71
Education                                1.00       0.62    0.66       0.54           0.60
Family                                              1.00    0.68       0.60           0.64
Peer                                                        1.00       0.65           0.78
Substance use                                                          1.00           0.70
Individual                                                                            1.00

Note: All correlations significantly different from zero (p < .05).


TABLE 6:
Arrested Within 12 Months of Assessment, by Resiliency Score (Weighted)

Resiliency Score       No       Yes     % of sample
Low (12 or less)      64.5%    35.5%      35.8%
Medium (13-33)        84.5%    15.5%      33.6%
High (34+)            91.8%     8.2%      30.6%
Total                 79.6%    20.4%     100.0%

Chi-square = 88.3 (p < .0001) with 2 degrees of freedom


Protective and risk factors are scored differently.
The higher the protective score, the
more protective factors the youth has. Risk
scores have negative values; the more negative
the value, the higher the risk. Thus we would
expect positive correlations between the total
resiliency score and 1) total risk score, 2) total
protective score, and 3) the subcomponents
of both risk and protective scales. In fact,
that is what we see in Table 4. At the same
time, however, we see fairly high correlations
between individual subscale items (see Table
5), suggesting that they may be redundant.
Redundancy among the subscales of the
resiliency score was also reported by Little
(n.d.) in her analysis of the SDRRC.


FIGURE 1:
Percent Arrested During Follow-Up, by Age and Resiliency Score (Weighted)
[Chart not reproduced. Vertical axis: percent arrested, 0% to 50%. Horizontal axis: resiliency score group, Low (<=12), Medium (13-33), High (34+).]


Relationship Between Resiliency and 
Recidivism as Measured by Subsequent 
Arrest 


For each of the youths assessed, we determined
whether the youth was arrested
within the 12 months following the administration
of the assessment. The major question
for the validation study is whether
scores obtained on the SDRRC are related to
subsequent recidivism.

One of the issues for recidivism studies 
is whether or not subjects are “at risk” to 
reoffend. Individuals may be removed from 
the sample before they have a chance to 
reoffend—they may be sentenced to terms 
of incarceration during the entire follow-up 
period. In some cases, these individuals may 
be excluded from analyses, or they may be 
treated as censored observations. In order to 
determine how large a problem this might 
pose for the current study, we calculated the 
number of days youths were “on the street” 
from the point of their assessment until 12 
months later. The vast majority of youths 
(over 90 percent) had at least 10 months of 
street time. For the remaining youths, analy- 
ses revealed that even those with very mini- 
mal “street time” (less than 2 months) were 
arrested. For this reason, we did not exclude 
any youths from our analyses of recidivism. 

Table 6 presents the recidivism results for 
the full sample. For this and other analyses, 
we divided the sample into approximate thirds 
and categorized the resulting groups as “low” 
(those with scores of 12 or less),⁶ “medium” (those
with scores between 13 and 33), and “high”
(those with scores of 34 or higher). Table 6
shows that the scale does validate for the over- 
all sample. Only 8 percent of “high” resiliency 
youths were arrested, compared with almost 
36 percent of those with “low” resiliency. 
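
As a rough check on the chi-square statistic reported with Table 6, the sketch below rebuilds approximate cell counts from the published percentages and computes the usual test of independence. It assumes the weighted analysis was scaled to the final sample size of 1,036, which the article does not state, so the result can only be expected to land near the reported value of 88.3 rather than match it. The function name is hypothetical.

```python
def chi_square_from_rates(n_total, group_shares, arrest_rates):
    """Pearson chi-square for a groups x (arrested, not arrested) table.

    n_total      : assumed total number of youths
    group_shares : share of the sample in each resiliency group
    arrest_rates : proportion arrested within each group
    """
    counts = [n_total * share for share in group_shares]
    arrested = [c * r for c, r in zip(counts, arrest_rates)]
    overall = sum(arrested) / sum(counts)  # pooled arrest rate
    chi2 = 0.0
    for c, a in zip(counts, arrested):
        for observed, expected in ((a, c * overall), (c - a, c * (1 - overall))):
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Percentages from Table 6: low, medium, and high resiliency groups.
chi2 = chi_square_from_rates(1036, [0.358, 0.336, 0.306], [0.355, 0.155, 0.082])
print(round(chi2, 1))
```

Rounding in the published percentages and the exact handling of weights both contribute to any gap between this figure and the reported statistic.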


Subgroup Analyses 


Figures 1, 2, and 3 present the results by age, 
gender, and ethnicity, respectively. Within 
each of the major racial groups, the resil- 
iency score is significantly related to recidi- 
vism. Regardless of ethnicity, the higher 
the resiliency, the lower the likelihood of 
arrest for youths. The same holds true for 
males and females, and across all ages. The 
discriminatory power of the instrument 
appears to be greatest for the younger youths 
in the sample (age less than 15), most likely 
due to more variability in outcomes among 
younger juveniles. 


⁶ This includes those with negative scores.


FIGURE 2:
Percent Arrested During Follow-Up, by Gender and Resiliency Score (Weighted)
[Chart not reproduced.]


FIGURE 3:
Percent Arrested During Follow-Up, by Ethnicity and Resiliency Score (Weighted)
[Chart not reproduced.]


Assessing Scale Properties and
Recidivism


Prior analyses have examined the relationship
between the total resiliency score and
rearrest. In Table 7 below, we present the
relationship of individual subscales to rearrest.
Recall that the more negative the risk score,
the higher the risk. Thus we would expect a
negative correlation between risk subscales
and rearrest. All subscales correlate significantly
with rearrest. The absolute correlation
between the total resiliency score and
rearrest is 0.27, similar to the correlation
observed by the Washington State Institute
for Public Policy (2004) for misdemeanor
and felony recidivism for the Washington
Pre-Screen Assessment inventory. Interestingly,
it is higher than the correlation reported
by Little (n.d.). Resiliency scores have a
higher correlation than do their respective
protective and risk subscales, with only one
exception (family protective factors).


Controls for Additional Factors Related 
to Recidivism 


Earlier analyses have examined the uni- 
variate relationship between SDRRC score 
and recidivism. In the following analyses, 
we examine the relationship controlling for 
additional factors that may impact how well 
SDRRC predicts recidivism. These factors 
include age, gender, and race/ethnicity, as 
well as whether the case is supervision (vs. 
non-supervision) and court (vs. non-court). 

TABLE 7:
Mean Assessment Scores and
Correlations with Arrest During Follow-Up (Weighted)

Score                                 Mean         Correlation
Total resiliency                   [illegible]        -0.27
Total risk                         [illegible]        -0.24
Net risk for delinquency               1.71           -0.24
Net risk for education                 2.02           -0.24
Net risk for family                    5.20           -0.19
Net risk for peer                      4.31           -0.24
Net risk for substance use             3.89           -0.19
Net risk for individual                2.42           -0.23
Delinquency risk factors              -2.84        [illegible]
Education risk factors                -3.25           -0.21
Family risk factors                   -1.75           -0.13
Peer risk factors                     -2.10           -0.19
Substance use risk factors            -1.65           -0.12
Individual risk factors               -2.58           -0.19
Delinquency protective factors         4.55           -0.19
Education protective factors           5.27           -0.23
Family protective factors              6.94           -0.20
Peer protective factors                6.42           -0.21
Substance use protective factors       5.54           -0.19
Individual protective factors          5.00           -0.21

Note: All correlations in this table are
significantly different from zero (p < .05).


Table 8 presents the results from a logistic
regression analysis of the total sample. We
see that, even controlling for other factors
that might be related to recidivism, SDRRC
resiliency is still significantly related to rearrest.
Other factors are also related to rearrest:
age (not being in the youngest or oldest
age group⁷), being male (as opposed to being
female), being black (as opposed to being
white), and being under probation supervision
during the 12-month follow-up period.
The overall measure of the model yielded a
Wald chi-square value of 102.1 (p < .0001).

The relatively lower correlations between 
SDRRC items and rearrest for Hispanic and 
“other” youths observed in Table 4 might 
suggest that the resiliency measure is not as 
strong a predictor for some groups as it is 
for others. To test this, we included
interaction terms between race/ethnicity
and resiliency in the model identified
in Table 8 above. Results, shown in Table 9
below, confirm that resiliency is differential- 
ly related to recidivism for whites (compared 
with Hispanics), although not significantly 
for blacks or “other” youths. 
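
The estimates in Table 8 can be read with a little arithmetic. The sketch below uses the age, age-squared, and male coefficients as printed in Table 8; the function names are ours and purely illustrative, and we assume age was coded in years. It shows how a Wald chi-square follows from an estimate and its standard error, how a log-odds coefficient converts to an odds ratio, and where a quadratic age term places the peak of the fitted log-odds of rearrest.

```python
import math

# Estimates and standard errors as printed in Table 8.
AGE, AGE_SE = 2.6623, 0.9056
AGE_SQ = -0.0960
MALE = 0.9814


def wald_chi_square(estimate, std_error):
    """Wald chi-square for one coefficient: (estimate / standard error) squared."""
    return (estimate / std_error) ** 2


def odds_ratio(estimate):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(estimate)


def peak_age(linear, quadratic):
    """Age at which linear*age + quadratic*age**2 is largest (needs quadratic < 0)."""
    return -linear / (2 * quadratic)


print(round(wald_chi_square(AGE, AGE_SE), 2))  # close to the 8.6421 reported for age
print(round(odds_ratio(MALE), 2))              # odds of rearrest, males relative to females
print(round(peak_age(AGE, AGE_SQ), 1))         # age at which the fitted log-odds peak
```

Taken at face value, these coefficients put the peak at roughly 14 years, which is consistent with the finding that youths in neither the youngest nor the oldest age group were the most likely to be rearrested.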


* We include the square of age as a factor in the
logistic regression because age has a curvilinear 
relationship with rearrest. Little (n.d.) used a similar 
analytic approach in her evaluation of the SDRRC. 
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TABLE 8: 
Logistic Regression Results for Arrest During Follow-Up (Weighted)

Variable            Estimate     Standard Error    Wald Chi-Square    Pr > Chi-Sq
Intercept           -20.4302     6.7603            9.1329             0.0025
Age                 2.6623       0.9056            8.6421             0.0033
Age squared         -0.0960      0.0301            10.1707            0.0014
Male                0.9814       0.2408            16.6081            <.0001
Black               0.1976       0.2046            0.9325             0.3342
White               0.2881       0.2680            1.1556             0.2824
Other race          0.1627       0.3532            0.2121             0.6451
Supervision         1.5024       0.4206            12.7614            0.0004
Court case          0.7125       0.3766            3.5786             0.0585
Resiliency          0.0285       0.00430           43.8683            <.0001


TABLE 9:
Logistic Regression Results for Arrest During Follow-Up, With Interaction Terms (Weighted)

Variable            Estimate     Standard Error    Wald Chi-Square    Pr > Chi-Sq
Intercept           -20.0157     6.6641            9.0212             0.0027
Age
Age squared         0.0935       0.0297            9.9074             0.0016
Male
Black
White               0.0476       0.3186            0.0223             0.8812
Other race          0.2739       0.4633            0.3496             0.5544
Supervision         1.5069       0.4182            12.9845            0.0003
Court case          0.6778       0.3752            3.2631             0.0709
Resiliency          -0.0217      0.00551           15.4913            <.0001
Resiliency*Black    -0.0171      0.0103            2.7451             0.0976
Resiliency*White    -0.0313      0.0141            4.9051             0.0268
Resiliency*Other    0.000710     0.0140            0.0026             0.9596


One of the questions we want to answer is whether the provision of services influences the youth’s recidivism. We would expect that those receiving services might have lower recidivism rates. In order to evaluate this possibility, we tested the multiple logistic regression model presented above, with the inclusion of the number of services received by youths. Results of this regression showed that the number of services was positively correlated with recidivism. In other words, the more services received, the more likely the youth was to have an arrest during the follow-up period. This is most likely due to the fact that higher-risk youths are provided more services. In fact, the correlation between SDRRC resiliency and the number of services was -0.19 (p < .0001). We conducted supplemental analyses in which we divided the sample into low-, moderate-, and high-risk groups and performed the regression runs within each risk group. Results showed no significant relationship between the number of services and recidivism once youth resiliency was controlled for.


Discussion and Conclusions


Our analyses showed that the SDRRC has both internal and predictive validity for youth in the Los Angeles County Probation system. Total resiliency scores are correlated with total risk and total protective scores; subscales within risk and protective scores are significantly correlated with their overall scales. Subscales are often highly correlated with each other, however, suggesting a degree of redundancy in the instrument. The instrument and its subscales were significantly related to arrest for youths 12 months after their assessment. The scale was also significantly related to recidivism for major subgroups of interest: youths of different ethnicities, as well as both males and females. In analyses that took into account other factors related to recidivism, the SDRRC remained a significant predictor of subsequent arrest. However, the scale does seem to work differently for some youths. In particular, the scale is not as strong a predictor for Hispanic youths as for other youths.


Limitations of Current Research


Research studies are subject to limitations, and this one is no exception. Our follow-up was limited to 12 months following youth assessment with the SDRRC. Although this provides a window of time over which to observe behavior, longer follow-up time periods are preferable. Initially, a longer follow-up period had been planned, but the assessment phase took longer than expected.

As with many recidivism studies, our study relies on official records for measures [of recidivism. We did not] have access to youths’ self-reported criminal behavior, which can provide a more direct measure of criminal behavior (only a fraction of offenses result in arrest). Future research may want to examine the extent to which the SDRRC also corresponds with self-reported criminal behavior. To our advantage, however, the pilot test was conducted before the [instrument was put into use, so] validity testing was not contaminated by any system policies or practices that were based on classifications by the SDRRC.

As indicated earlier, the SDRRC does not have any predetermined cut-points for resiliency. Without cut-points for classification, we could not conduct any meaningful analyses of false positives and false negatives, that is, of the extent to which errors in prediction are made when using the SDRRC. Cut-points will be determined during the implementation phase of the instrument in Los Angeles. We recommend that sensitivity analyses be part of continued monitoring of the instrument once it has been integrated into Probation practices (as described below).

In addition, more thorough examination 
needs to be conducted on differences in the 
scales and subscales for different subgroups 
of youths. This should also be part of contin- 
ued monitoring of the instrument. 


Systemwide Implementation of LARRC 


In summer of 2004, the Los Angeles County 
Probation Department started the process to 
institutionalize the SDRRC, now referred to 
as the LARRC. Training on LARRC began 
on August 4, 2004. In December 2004, staff 
began completing the LARRC assessment 
utilizing an automated system. 

The Los Angeles County Probation 
Department has started a policy that requires 
all DPOs in the Juvenile Bureaus to assess 
and reassess minors assigned to their casel- 
oads at defined intervals as part of a plan to 


enhance case management services. As investigators
are trained in the administration of
the LARRC, the assessment will be utilized 
at the investigation level (the point at which 
the pilot assessment was done) and will con- 
tinue through the supervision stages in order 
to address protective/risk/resiliency factors, 
update case planning efforts, and link minors 
to appropriate services and interventions. 
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Appendix 
DATE        POJ #
JAIN:
YOUTH NAME (L/F/M)        NICKNAME        HOME PHONE
RESIDENCE (STREET)        CITY        ZIP        ALT. PHONE (SPECIFY)
SCHOOL        GRADE        ETHNICITY        INTERPRETER DESI    [ ] YOUTH    [ ] PARENT

WHAT HAS ALREADY BEEN DONE FOR YOUTH/FAMILY?
CASE TYPE:    [ ] COURT    [ ] NON-COURT

MINOR: SPEAKS ENGLISH:
MINOR’S ASSESSMENT:    [ ] PROFICIENT    [ ] LIMITED    [ ] NONE
PRIMARY LANGUAGE IN HOME:
MINOR’S PREFERRED LANGUAGE:

PARENT/GUARDIAN: SPEAKS ENGLISH:
PARENT/GUARDIAN ASSESSMENT:    [ ] PROFICIENT    [ ] LIMITED
PRIMARY LANGUAGE IN HOME:
PARENT/GUARDIAN PREFERRED LANGUAGE:

PERSON COMPLETING THIS FORM:
AGE AT FIRST ARREST        # PRIOR ARRESTS


ADDITIONAL PROTECTIVE FACTORS


1 (No) Commitment to School                                    0 1 2 3
2 (No) Recognition for Involvement in Pro-social Activities    0 1 2 3
3 (No) Relations with parents / other adults                   0 1 2 3
4 (No) Parental Monitoring                                     0 1 2 3
5 (Negative) Parent Evaluation of Peers                        0 1 2 3
6 (No) Friends Engage in Conventional (Pro-social) Behavior    0 1 2 3
7 (Not) Intolerant attitude towards deviance                   0 1 2 3
8 (No) Positive Social Orientation                             0 1 2 3

OTHER RISK FACTORS OR CONCERNS (PLEASE CHECK ALL BOXES THAT MAY APPLY):
[ ] ANIMAL CRUELTY
[ ] BLADDER CONTROL, DAYTIME LACK OF
[ ] BLADDER CONTROL, NIGHTTIME LACK OF
[ ] CHRONIC TARDINESS
[ ] EMOTIONAL DISTRESS
[ ] FIRE SETTING
[ ] HEALTH PROBLEMS
[ ] HOMELESSNESS
[ ] INAPPROPRIATE SEXUAL BEHAVIOR
[ ] LOSS OR GRIEF
[ ] PARENTAL ABUSE/NEGLECT
[ ] PARENTAL REJECTION
[ ] PEERS ARE
[ ] PREDATORY OF
[ ] HATE CRIME
[ ] FOR PERSONAL GAIN
[ ] RACIALLY BASED
[ ] SEXUALLY BASED
[ ] SCHOOLYARD BULLYING
[ ] SELF-MUTILATION
[ ] TOBACCO USE
VICTIM OF:    [ ] DOMESTIC VIOLENCE    [ ] PHYSICAL ABUSE

COMMENTS AND OBSERVATIONS:

SUMMARY SCORE
TOTAL PROTECTIVE SCORE        TOTAL RESILIENCY SCORE
TOTAL RISK SCORE              TOTAL ADDITIONAL PROTECTIVE SCORE
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[ INSTRUCTIONS: 
IF YOUR SELECTION IS NOT ABSOLUTELY AFFIRMATIVE, USE AN ARROW POINTING IN THE DIRECTION YOU WOULD LEAN TO IF GIVEN ANOTHER CHOICE. SEE EXAMPLE TO RIGHT.
EXAMPLE: YES NO UNK

RESPONSE COLUMNS: YES / SOMEWHAT / NO / UNK

DELINQUENCY - Protective
1 SUPPORT/REINFORCEMENT IN COMMUNITY
2 PRO-SOCIAL ADULT RELATIONS
3 EXTENSIVE STRUCTURED ACTIVITIES
4 PARTICIPATES IN FAITH COMMUNITY
DELINQUENCY - Risk
PRIOR ARRESTS
SIGNIFICANT CRIME IN NEIGHBORHOOD
INFLUENCE WHILE UNDER
ASSAULTIVE OR FIGHTING BEHAVIOR

EDUCATION - Protective
11 SCHOOL ENGAGEMENT/BONDS
12 ATTACHMENTS W/ACADEMIC ACHIEVER
13 POSITIVE INTERACTIONS WITH TEACHERS
15 CARING/SUPPORTIVE SCHOOL CLIMATE
EDUCATION - Risk
POOR ACADEMIC
PATTERN OF TRUANCY PAST YEAR
SUSPENSION/EXPELLED

FAMILY - Protective
21 COMMUNICATES WITH FAMILY
22 CONSTRUCTIVE USE OF TIME AT HOME
24 FAMILY SUPPORT
28 UNCONDITIONAL REGARD FROM A PARENT
FAMILY - Risk
POOR RELATIONS WITH PARENT(S)
PARENTAL SUPERVISION DEFICIENCIES
PARENTAL CRIMINALITY/SUBSTANCE ABUSE

PEER - Protective
31 POSITIVE PEER RELATIONS
32 HAS AT LEAST ONE PERSON TO CONFIDE IN
33 VALUES DIGNITY/RIGHTS OF OTHERS
34 ABILITY TO MAKE FRIENDS
35
PEER - Risk
SOCIALLY ISOLATED
VERY FEW PROSOCIAL ACQUAINTANCES
HAS GANG AFFILIATION/ASSOCIATION
HAS DELINQUENT FRIENDS

SUBSTANCE USE - Protective
41 PARENTS MODEL HEALTHY MODERATION
42 EFFECTIVELY MANAGES PEER PRESSURE
43 YOUTH IS FREE OF DISTRESSING HABITS
44 YOUTH MANAGES STRESS WELL
SUBSTANCE USE - Risk
PATTERN OF ALCOHOL USE
SUBST. (OTHER THAN
USES SUBSTANCES FREQUENTLY
SUBSTANCE USE INTERFERES WITH DAILY

INDIVIDUAL - Protective
51 VALUES HONESTY/INTEGRITY
52 SELF CONTROL
53 SELF EFFICACY IN PROSOCIAL ROLES
54 PROBLEM-SOLVING SKILLS
55 PLANS, ORGANIZES, & COMPLETES TASKS
INDIVIDUAL - Risk
PROSOCIAL INTERESTS
SUPPORTIVE OF DELINQUENCY
ANGER MANAGEMENT ISSUES
SENSATION SEEKING


TOTAL PROTECTIVE SCORE        TOTAL RESILIENCY SCORE        TOTAL RISK SCORE
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AFTER DECADES OF intellectual neglect, 
the field of corrections has decided to 
embrace the world of science and adhere 
to the dictums of “evidence-based” corrections.
The term “evidence based” originated
in the field of medicine as far back as the
19th century in Europe and means many
things to many people.¹ In medicine it is
very important that medical procedures and 
the use of healing drugs and medicine actu- 
ally demonstrate their effectiveness through 
rigorous experimental studies before they 
are brought to market. In the social sci- 
ences, evidence-based research suggests that 
governmental policies must be shaped by 
scientific evidence that shows the policy has 
some cause and effect value. For many good 
reasons, the field of corrections has never 
had to pass such a high standard. But after 
American corrections has set world records 
in the numbers of persons incarcerated and 
placed on probation and parole, some crimi- 
nal justice professionals believe the field 
needs to get serious about its $60 billion a 
year industry and produce a better product. 
Plagued by recidivism rates that have 
remained stubbornly stagnant for 30 years 
(or more) and by a general feeling among 
most politicians that about the only thing 
that corrections can do is inflict widespread 
punishment, criminal justice practitioners 
have seen the more benign goals of treat- 
ment and rehabilitation take a back seat to 
the more politically appealing ideologies of
deterrence, incapacitation, and retribution.
It’s a given that no politician can successfully
run on a platform demanding more and better
treatment for the two million plus prisoners
held in our nation’s jails and prisons.

But the times are a-changing. Led by a small number of Canadian and American criminologists, there is now a considerable effort to get rehabilitation and treatment back on the map. Their argument is advertised not as ideological but as empirical. The major premise is that treatment does work if it is done right. Therefore, the primary reason treatment is ineffective is that it is more often done wrong.

1 See http://bmj.bmjjournals.com/cgi/content/full/312/7023/71. Also see http://www.ahrq.gov/clinic/epc/ for a listing of the growing number of evidence-based practice medical and mental health centers in the U.S. and Canada.

One major reason that treatment is not 
done right is that offenders are not prop- 
erly assessed for risk by most correctional 
agencies. Without the proper diagnosis, it 
is not possible to assign prisoners to the 
proper treatment. Indeed, prior research has 
shown that assigning low-risk people to treat- 
ment they really don’t need actually increases 
recidivism. A recent evaluation of Ohio's 
community corrections act clearly shows that 
many correctional programs are not targeting 
the proper offender, which in turn diminishes 
the capacity to reduce recidivism rates.²

The widespread absence of risk assessment in corrections has historically hampered correctly targeted treatment. It was not until the 1980s that prison systems, due in part to a number of federal lawsuits, finally started using custody classification systems to assign prisoners to the correct prisons. The results have been impressive in most states, with increasing numbers of prisoners now being assigned to minimum security settings. The taxpayers have benefited somewhat because the lower the security, the lower the incarceration costs. Unfortunately, the huge increases in the correctional populations have largely negated whatever savings taxpayers would have realized.

2 Lowenkamp, Christopher T. and Edward J. Latessa. 2005. “Evaluation of Ohio’s CCA Funded Programs. Final Report.” Cincinnati, OH: University of Cincinnati, Center for Criminal Justice.

James Austin, Ph.D.
The JFA Institute

Parole boards, which still govern the 
date and conditions of release for prison- 
ers in most states, have only recently (and 
only in a few states) embraced the idea that 
their decisions would be influenced by some 
calculation of the prisoner’s risk to recidi- 
vate. Probation and parole agencies have 
also begun to implement risk instruments 
to guide their decisions as to what levels of 
supervision are most appropriate for their 
burgeoning caseloads. 

But despite these advances, no jurisdiction can point to significant reductions in recidivism rates, and that includes Canada, from which most of the new emphasis on rehabilitation has emanated.³ Many probation and parole officers seem less interested in risk assessment and case management and more concerned with racking up as many violations among their caseloads as they can. I don’t recall any prison, parole, or probation department being chastised for having too high a recidivism rate, even though there is considerable evidence that they could have a positive effect on these rates.

The remainder of this article focuses on the state of risk assessment. I concede that in order for rehabilitation to have a meaningful impact on recidivism rates, the proper identification of persons by their risk level is essential. But I now worry that the field is placing too much emphasis on risk assessment with little effort to provide those basic treatment services that are needed.

3 Austin, James. 2006. “What Should We Expect From Parole?” American Probation and Parole.


The Basics of Risk Assessment 


Before an agency decides to adopt a risk 
assessment system, a number of tests need 
to be completed to ensure it will work. There 
seems to be a trend in corrections to uncriti- 
cally accept the latest “innovation” and 
adopt it without understanding its strengths 
and limitations. In risk assessment, unless 
these steps are completed, application of 
the risk assessment process may prove more 
harmful than helpful as offenders will be 
improperly classified. 


1. Risk Assessment Instruments Must Be 
Tested on Your Correctional Population 
and Separately Normed for Males and 
Females. 


There is a tendency for correctional agencies 
to simply borrow or buy an instrument that 
has been developed on another population 
that may not reflect the attributes of their 
own offender populations. In research terms 
this issue has to do with the “external valid- 
ity” of the instrument and the ability to gen- 
eralize the findings of a single study of the 
instrument to other jurisdictions. Gener- 
ally, if a risk assessment instrument has not 
been tested on multiple populations under 
varying conditions, it will not work well on 
populations it has not been tested on. 

Male and female risk assessment is 
another issue for proposed risk instruments. 
Men and women are different, behave dif- 
ferently, and respond differently to various 
forms of treatment and supervision. Yet 
when it comes to risk assessment we often 
assume they are the same. Recidivism and 
career criminal studies consistently show 
that females are less involved in criminal 
behavior, are less likely to commit violent 
crimes and are less likely to recidivate after 
being placed on probation or parole. Further, 
since the “criminal population” is largely 
male, any instrument that is tested on a total 
correctional population will naturally mis- 
classify females. 


2. An Inter-Rater Reliability Test Must Be 
Conducted 


Both an inter-rater reliability test and a validity
test must be completed by independent
researchers who have no economic stake in
proving the effectiveness of the instrument.
Inter-rater reliability has to do with the accu- 
racy and consistency of the instrument being 
completed by those who will be responsible 
on a day-to-day basis for completing the 
form and interpreting the results. Often this 
work is done by probation and parole offi- 
cers or parole board hearing examiners. It is 
a skilled task that not all correctional staff 
are well suited for. 

The inter-rater reliability test consists
of taking a representative sample of offenders
(a minimum of 100 cases), each of whom is then
independently scored on the proposed
instrument by two staff members who have been
trained in its use. Any item on
the instrument that does not reach the 80 
percent agreement level should be deleted. 
If the instrument does not demonstrate an 
agreement level of 90 percent, it should not be 
implemented. 
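
The agreement rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item names and scores are made up, the sample is far smaller than the 100-case minimum, and we assume the 90 percent standard is applied to the overall risk level each rater assigns, which the text does not spell out.

```python
ITEM_THRESHOLD = 0.80        # items below this agreement level are deleted
INSTRUMENT_THRESHOLD = 0.90  # agreement needed before the instrument is implemented


def percent_agreement(rater_a, rater_b):
    """Share of cases on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def items_to_delete(scores_a, scores_b):
    """Names of items whose agreement falls below ITEM_THRESHOLD."""
    return sorted(
        item for item in scores_a
        if percent_agreement(scores_a[item], scores_b[item]) < ITEM_THRESHOLD
    )


def instrument_passes(levels_a, levels_b):
    """Assumption: the 90 percent standard applies to the overall risk level."""
    return percent_agreement(levels_a, levels_b) >= INSTRUMENT_THRESHOLD


# Hypothetical scores from two raters for five cases on two made-up items.
rater_1 = {"prior_convictions": [1, 0, 1, 1, 0], "attitude": [1, 0, 0, 1, 1]}
rater_2 = {"prior_convictions": [1, 0, 1, 1, 0], "attitude": [0, 0, 1, 1, 1]}
levels_1 = ["low", "high", "medium", "high", "low"]
levels_2 = ["low", "high", "high", "high", "low"]

print(items_to_delete(rater_1, rater_2))      # ['attitude'] (raters agree on 3 of 5 cases)
print(instrument_passes(levels_1, levels_2))  # False (raters agree on 4 of 5 risk levels)
```

Simple percent agreement is used here because it is the measure the text names; it does not correct for agreement that would occur by chance.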


3. A Validity Test Must Be Conducted 


The validity test is designed to see how well 
the risk factors actually predict recidivism. 
This test is done by drawing a sample of 
offenders who were sentenced to probation 
or released from prison and tracking them 
for a period of 2 to 3 years. Since most juris- 
dictions are anxious to have the risk assess- 
ment instrument implemented as quickly as 
possible, the validation sample often consists 
of persons sentenced or released 2 to 3 years 
prior to the study being conducted. The
researcher must then perform a variety
of bivariate and multivariate statistical
tests to determine which items should be
used, the weights assigned to each item, and
the proper risk level scale.
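
One of the simplest checks in such a validation is whether recidivism rates actually rise from one risk level to the next. The sketch below, with hypothetical follow-up records and function names of our own choosing, tabulates the rate for each level and tests that ordering. It is not a substitute for the bivariate and multivariate tests mentioned above.

```python
from collections import defaultdict


def recidivism_rates(cases):
    """Recidivism rate per risk level from (risk_level, recidivated) pairs."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for level, recidivated in cases:
        totals[level] += 1
        failures[level] += int(recidivated)
    return {level: failures[level] / totals[level] for level in totals}


def rates_increase(rates, ordered_levels):
    """True when each risk level has a higher rate than the level below it."""
    values = [rates[level] for level in ordered_levels]
    return all(low < high for low, high in zip(values, values[1:]))


# Hypothetical follow-up records: (assigned risk level, recidivated during follow-up).
sample = (
    [("low", False)] * 8 + [("low", True)] * 2
    + [("medium", False)] * 6 + [("medium", True)] * 4
    + [("high", False)] * 3 + [("high", True)] * 7
)

rates = recidivism_rates(sample)
print(rates)  # {'low': 0.2, 'medium': 0.4, 'high': 0.7}
print(rates_increase(rates, ["low", "medium", "high"]))  # True
```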


4. The Instruments Must Allow For 
Dynamic and Static Factors That Have 
Been Well Accepted and Tested In A 
Number of Jurisdictions 


As noted above, the risk instrument should 
consist of static and dynamic risk items. 
Table 5 summarizes commonly used risk 
factors that have been repeatedly validated
in a number of studies. These
are separated into the static and dynamic 
categories. Of the two, the dynamic factors 
are generally the more powerful predictors, 
as they reflect the person’s current social and 
economic environment. If an instrument 
does not employ dynamic factors, it is unlikely
to perform accurately.


5. The Instruments Must Be Compatible 
with the Skill Level of Your Staff 


A wide variety of risk assessment
instruments is available to jurisdictions.
However, they require very different skill 
levels. The more traditional risk assessment 
forms generally consist of not more than 
10-12 items and are based on factual items 
that can be gleaned from court and case 
files and require minimal interpretation by 
staff trained in their use. Age at first arrest, 
current age, and number of prior probation 
violations within the past five years come 
under this category. For these instruments 
staff need little academic training to conduct 
an accurate assessment. 

The more complicated risk assessment 
items require a well-structured interview 
and a review of all relevant case file data. 
These instruments often have 40-60 items 
with several sub-scales reflecting varying 
domain risk levels. With such instruments it 
is more difficult to achieve the minimal lev- 
els of reliability and validity, unless the staff 
is highly skilled in the application of psy- 
chometric assessment forms. Without such 
skilled staff, the use of these instruments is 
not recommended. 


6. The Risk Assessment Must Have “Face 
Validity” And Transparency with Staff, 
Prisoners, Probationers, Parolees and 
Policy Makers. 


Finally, the instrument and the entire risk 
assessment process need to be credible with
all of the parties that are being directly 
impacted by it. Staff assigned to the risk 
assessment process must believe that the 
instrument actually works and will help 
inform the decision process for sentenc- 
ing, release, and supervision decisions. The 
decision makers (judges, parole boards, and 
correctional administrators) must also have 
confidence in the risk assessment process 
and demonstrate through their decisions 
that they are using it. In particular, statistics
should show that offenders assessed as
low risk have lower rates of being
sentenced to prison, shorter sentences,
higher rates of being paroled, and
lower levels of supervision. High-risk offenders
should show just the opposite trends.
The people who are being assessed for 
risk must also believe that the process is 
credible and will be used by decision mak- 
ers. The process should also be transparent 
and not some mysterious process where 
the offender is unaware of what factors are
being used and how each is scored. This
is especially helpful for risk instruments 
that employ dynamic risk factors—items 
that can change based on the offender’s 
social and economic situation (employ- 
ment, residency, and family relations). By 
understanding these dynamic risk factors, 
the offender can take actions or seek sup- 
port that will actually reduce the risk to 
public safety. 


A Closer Look at the LSI-R 


As the interest in risk assessment has grown, 
so too has the private industry engaged in 
developing and distributing these systems. 
Currently there are two major privately held 
risk assessment systems available to correc- 
tions. The most widely advertised system is 
the Level of Service Inventory—Revised (or 
LSI-R), which was first developed in Canada 
and has now been adopted by a number of 
U.S. correctional agencies.⁴ LSI-R is owned
and distributed by Multi Health Systems,
Incorporated, which distributes a wide array
of psychologically based assessment tools.⁵
The other is the Correctional Offender Management
Profiling for Alternative Sanctions
(or Compas), owned and distributed by the
Northpointe Institute for Public Management,
Inc., which also offers a privately held
prison and jail classification system.⁶

Few independent validation studies of 
these two systems appear in the literature. 
By “independent” I mean studies done by 
researchers who have no financial interest 
in the two companies. Because the LSI-R has 
been around longer and is more widespread 
than the Compas, there have been a few 
recent studies in Washington, Pennsylvania, 
and now Vermont. As will be shown below,
these studies show that many of the individual
factors used in the LSI-R scale are not
predictive of re-offending behavior.⁷

4 For a recent summary of the validity of the LSI-R and its history see Girard, Lina and J. Stephen Wormith. 2004. “The Predictive Validity of the Level of Service Inventory-Ontario Revision on General and Violent Recidivism among Various Offender Groups.” Criminal Justice and Behavior, Vol. 31, No. 2: 150-181.

5 For more information on MHS, Inc., see their website at http://www.mhs.com/index.htm.

6 For more information about Northpointe see their website at http://www.northpointeinc.com/contact.htm.

Why is this so? The principal problem 
with the LSI-R is that it is difficult to achieve 
a sufficient level of inter-rater reliability on 
many of its items. The LSI-R consists of 54
items that are sorted into the following ten
substantive areas believed to be related to
future criminal behavior:

1. Criminal History (10 items)
2. Education and Employment (10 items)
3. Financial (2 items)
4. Family and Marital (4 items)
5. Accommodations (3 items)
6. Leisure and Recreation (2 items)
7. Companions (5 items)
8. Alcohol and Drugs (9 items)
9. Emotional and Personal (5 items)
10. Attitude and Orientation (4 items)


The LSI-R scorer is expected to give
either a dichotomous “yes” or “no” response to 37
items and a Likert-scale rating of satisfactory,
relatively satisfactory, relatively unsatisfactory,
or very unsatisfactory for the other 17
items. For example, one question in the
family/marital domain requiring a
satisfaction rating is “dissatisfaction with
marital or equivalent situation.” The scorer
is instructed to base this assessment on a 
review of the case file data and an interview 
with the subject. On the accommodation 
domain, one question requiring a yes/no 
response is “three or more address changes 
last year.” Such items and the associated
response formats raise important questions about
whether correctional staff (most of whom 
have little if any training in psychometric 
testing) can correctly use this assessment. 
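
The structure just described can be written down and checked for internal consistency. The sketch below simply records the item counts given in the text (the variable names are ours) and confirms that the ten areas and the two response formats both account for all 54 items.

```python
# Item counts per LSI-R area, as listed in the text.
LSI_R_AREAS = {
    "Criminal History": 10,
    "Education and Employment": 10,
    "Financial": 2,
    "Family and Marital": 4,
    "Accommodations": 3,
    "Leisure and Recreation": 2,
    "Companions": 5,
    "Alcohol and Drugs": 9,
    "Emotional and Personal": 5,
    "Attitude and Orientation": 4,
}

YES_NO_ITEMS = 37   # dichotomous "yes"/"no" items
RATING_ITEMS = 17   # items rated from satisfactory to very unsatisfactory

total_items = sum(LSI_R_AREAS.values())
print(total_items)                                 # 54
print(YES_NO_ITEMS + RATING_ITEMS == total_items)  # True

# Share of the instrument devoted to each area, largest first.
for area, count in sorted(LSI_R_AREAS.items(), key=lambda kv: -kv[1]):
    print(f"{area}: {count / total_items:.1%}")
```

Criminal History accounts for 10 of the 54 items, a little under one fifth of the instrument.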

Researchers in Washington State con- 
ducted one of the first independent valida- 
tion studies of the LSI-R as it was being 
applied to released state prisoners.⁸ The
authors found that the LSI-R criminal history
factors were strong predictors of recidivism
and produced most of the predictive
power for the instrument. Put differently,
many of the numerous other LSI-R items do
little to enhance the predictive attributes
of the LSI-R. These findings led the researchers to
recommend that some of the LSI-R items be
combined with other non-LSI-R factors, like
current age and gender, to provide for a better
risk instrument.

7 Washington State Institute for Public Policy. Washington’s Offender Accountability Act: An Analysis of the Department of Corrections’ Risk Assessment. December 2003. Olympia, Washington; James Austin, Dana Coleman, Kelly Dedel-Johnson, and Johnette Payton. 2003. Reliability and Validity of the LSI-R Risk Assessment Instrument. Washington, DC: The Institute on Crime, Justice and Corrections at the George Washington University; and James Austin, 2006. Vermont Parole Board Risk Based Guidelines, Technical Assistance Report #2. Washington, DC: National Institute of Corrections.

8 Washington State Institute for Public Policy. 2003. p. 4.

A recent study of the LSI-R as used by the 
Pennsylvania Parole Board and the Depart- 
ment of Corrections is instructive on dif- 
ficulties associated with the LSI-R scoring 
process.⁹ In particular, it provides the results
of an inter-rater reliability study—a study 
that should be done for any risk assessment 
system. The Pennsylvania Parole Board was 
using the LSI-R scores to determine the 
suitability for release from prison. How- 
ever, there had been no attempt to validate 
the LSI-R on Pennsylvania prisoners, who
no doubt are somewhat different from the 
Canadian prisoners on whom the LSI-R 
had been developed. Further, the concept of 
using LSI-R for parole release considerations 
suggests a serious mis-application of the 
LSI-R, since many of the items have to do 
with the prisoner’s life prior to incarceration. 
For example, how does one assess whether 
a prisoner who has been incarcerated for 
several years has “some criminal acquaintances”
or “few anti-criminal friends”? Given
that many months or years may have passed 
since the offender was living in the commu- 
nity, the problems of accurate recall and the 
relevance of the questions for prisoners are 
rather obvious. 

But even with these issues, one must 
also determine if the assessors are able to 
produce reliable scoring results. To this end, 
several reliability tests were conducted. The 
basic test is relatively straightforward and 
easy to do. A sample of 120 cases was select- 
ed for the test. Within two weeks, two staff 
were required to independently score the 
sampled cases and determine the appropri- 
ate score for each case. The results are shown 
in Table 1. The table contains only the 16 
items that reached the 80 percent level of 
inter-rater agreement. The other 38 items 
had scores in the 60-70 percent range. If we 
use the more generous criterion of risk level,
the level of disparity is reduced but remains
at an unacceptable level, with a 29 percent
disagreement on the risk level. It is also noteworthy
that the items that have an acceptable
reliability score are the more factual ones
that are found in the more traditional risk
assessment instruments.

9 Austin, James, Dana Coleman, and Kelly Johnson. 2002. “Reliability and Validity of the LSI-R for the Pennsylvania Board of Parole and Probation.” Washington, DC: The Institute on Crime, Justice and Corrections, The George Washington University.

With such a level of “noise” in the scoring 
process, it is not surprising that only a few of 
the LSI-R items were found to be associated 
with recidivism. A recidivism study of 1,006 
prisoners who were scored on the LSI-R and 
had been released for at least one year was 
conducted. The first task was to perform an
item-by-item test of the 54 LSI-R scoring items to
see which ones were associated with recidi- 
vism. This analysis showed that only the 
following items had a statistical association 
with recidivism: 


1. Any prior convictions?
2. Two or more prior convictions?
3. Three or more prior convictions?
4. Arrested under age 16?
5. Escape history?
6. Probation/parole suspension during prior community supervision?
7. Three or more address changes the past year?
8. Current drug problem?
9. Drug problem related to law violations?
10. Drug problem related to school or work problems?
11. Mental health problems in the past?


A regression analysis was done to see 
which of these 11 items had an independent 
effect on recidivism. This resulted in the 
following eight items being used: any prior 
convictions, two or more prior convictions, 
arrested under age 16, prior probation/parole 
suspension, three or more address changes 
within the last year, current drug problem, 
problem affecting school/work, and mental 
health treatment in the past. 
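
The item-by-item screening step can be sketched as a series of two-by-two tests, one per scoring item. The text does not say which statistical test was used, so the Pearson chi-square below is an assumption, and the item names and cell counts are hypothetical (they merely sum to the 1,006 cases in the study). The regression step that follows the screening is not shown.

```python
from typing import Dict, List, Tuple

# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_05_DF1 = 3.841


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    a = item present, recidivated      b = item present, did not recidivate
    c = item absent, recidivated       d = item absent, did not recidivate
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty row or column: no association can be measured
    return n * (a * d - b * c) ** 2 / denom


def screen_items(tables: Dict[str, Tuple[int, int, int, int]],
                 critical: float = CRITICAL_05_DF1) -> List[str]:
    """Keep the items whose association with recidivism exceeds the critical value."""
    return [name for name, cells in tables.items()
            if chi_square_2x2(*cells) > critical]


# Hypothetical cell counts, for illustration only.
tables = {
    "two_or_more_prior_convictions": (300, 200, 200, 306),
    "hypothetical_attitude_item": (260, 240, 262, 244),
}

print(screen_items(tables))
```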

As shown in Table 3, only a small number 
of the 54 LSI-R scoring items are useful and 
most of them are not contributing to the 
risk assessment process. We also found that 
compared to the risk groups created by the 
full LSI-R, the condensed instrument creates 
risk categories with greater distinctiveness in 
terms of recidivism. Not only do these items 
have better predictive ability, but also they 
reduce the “high risk” category. According 
to this instrument, only 188 prisoners would 
be classified as “high risk,” compared to 
522 using the full LSI-R instrument. More 
importantly, the high-risk group created by 
the condensed instrument has a 69 percent
recidivism rate, compared to the 58 percent
recidivism rate of the LSI-R high risk group, 
indicating that the condensed instrument 
does a better job of selecting those prisoners 
representing the most significant danger to 
public safety. 
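
The comparison of risk groups rests on two steps: placing each case in a band according to its point total, and computing the recidivism rate within each band. A minimal Python sketch follows. The cut points (0 through 3, 4 through 8, 9 and above) are read from the point distribution in Table 3; the function names and the sample cases are hypothetical, and the item weights that produce the point totals are not given in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def risk_band(points: int) -> str:
    """Risk band for a point total, using the ranges shown in Table 3."""
    if points <= 3:
        return "low"
    if points <= 8:
        return "moderate"
    return "high"


def recidivism_by_band(cases: Iterable[Tuple[int, bool]]) -> Dict[str, Tuple[int, float]]:
    """Map each band to (number of cases, percent recidivated)."""
    counts = defaultdict(lambda: [0, 0])  # band -> [cases, failures]
    for points, recidivated in cases:
        tally = counts[risk_band(points)]
        tally[0] += 1
        tally[1] += int(recidivated)
    return {band: (n, round(100.0 * failures / n, 1))
            for band, (n, failures) in counts.items()}


# Hypothetical cases as (point total, recidivated?) pairs.
sample = [(0, False), (2, False), (3, True),
          (4, True), (5, False), (8, True),
          (9, True), (10, True), (11, True), (13, False)]

print(recidivism_by_band(sample))
```

An instrument separates risk groups well when the rates returned for the three bands are far apart, which is the sense in which the condensed instrument is said to create more distinct categories.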

In Table 4, the analysis is taken a step fur- 
ther. Along with the eight LSI-R items in the 
condensed instrument, we also include these
descriptive variables: age at release, marital
status, committing offense, and release
type. This instrument, combining a small 
number of reliable LSI-R items with a few 
demographic items, produced the best risk 
assessment results. In this analysis, we are 
able to develop greater specificity within the 
“low risk” category and to identify groups 


TABLE 1: 
LSI-R Reliability Scores at the 80% Level 
Variable % Agreement 
1. Any prior convictions? 96% 
2. Two or more prior convictions? 93% 
3. Three or more convictions? 93% 
4. Three or more present offenses? 81% 
6. Ever incarcerated upon conviction? 95% 
7. Escape history from a correctional facility? 81% 
8. Ever punished for institutional misconduct? 87% 
9. Charge/probation/parole suspended during prior community supervision? 91% 
10. Official record of assault/violence? 86% 
11. Currently employed? 86% 
12. Less than regular grade 10? 85% 
13. Less than regular grade 12? 88%
14. Three or more address changes last year? 82% 
15. Drug problem, ever? 88% 
16. Moderate interference? 84% 
17. Severe interference, active psychosis? 93% 
18. Mental health treatment, past? 87% 
19. Mental health treatment, present? 89% 


TABLE 2:
Cross-tabulation of the First and Second LSI-R Interviews of the Reliability Sample

[Cross-tabulation of 1st interview by 2nd interview. Risk levels: Low (0 through 15), Medium (16 through 22), High (23 and above), Total*. Cell counts are not legible in the source.]

% Disagreement: 29%; % Disagreement One Risk Level: 28%

*Note: Two cases were not scored a second time and were excluded from this analysis.
of prisoners with more distinct rates of re-offending.

TABLE 3:
Score using Select LSI-R Items by Failure Variable
Pennsylvania Board of Probation and Parole

Point Distribution      Total N    Total %    Recidivated %
TOTAL                     948      100.0%       53.4%
0                          17        1.8%       17.6%
1                          17        1.8%       29.4%
2                     [illegible]
3                     [illegible]
Total Low Risk            146       15.4%       33.6%
4                         119       12.6%       52.9%
5                         115       12.1%       47.8%
6                         133       14.0%       51.9%
7                         160       16.9%       56.3%
8                          87        9.2%       58.6%
Total Moderate Risk       614       64.8%       53.4%
9                         115       12.1%       70.4%
10                         23        2.4%       73.9%
11                         43        4.5%       60.5%
12                          1        0.1%        0.0%
13                          6        0.6%       83.3%
Total High Risk           188       19.8%       68.6%

TABLE 4:
Score using Select LSI-R and Demographic Items by Failure
Pennsylvania Board of Probation and Parole

Point Distribution      Total N    Total %    Recidivated %
TOTAL                 [illegible]  100.0%       33.2%
0-2                   [illegible]
3                     [illegible]    2.7%       21.7%
Total Lowest Risk          31        3.7%       16.1%
4                          20        2.4%       45.0%
5                          43        5.1%       37.2%
6                          64        7.5%       37.5%
Total Low Risk            127       15.0%       38.6%
7                          89       10.5%       51.7%
8                          91       10.7%       46.2%
9                         115       13.6%       52.2%
10                        104       12.3%       57.7%
11                         94       11.1%       56.4%
Total Moderate Risk       493       58.1%       52.9%
12                         92       10.8%       62.0%
13                    [illegible]    4.8%       78.0%
14                         29        3.4%       89.7%
15                         19    [illegible]    59.6%
Total High Risk       [illegible]

In Vermont a similar set of findings was noted. The study was similar
to the Pennsylvania one, in that the Parole Board was desirous of adopting
a risk instrument to guide its decision-making process. Although a formal
reliability test was not completed, a validation study was made on the
LSI-R and other factors believed to be associated with risk to public
safety. Two measures of recidivism rates were tested (return to prison
and a new conviction), and became the basis for determining which items
should be included in the final risk assessment instrument. The first
step was to conduct a bi-variate statistical analysis to determine which
items had a simple association with the three measures of recidivism.

The original validation study was based on 2,533 sentenced prisoners who
were released in 2002. Of this number only 644 had completed LSI-R
scoring results. As was found in the Washington state and Pennsylvania
studies, only a relatively small number (13) of the 54 LSI-R factors are
consistent and strong predictors of recidivism (items 4, 8, 9, 11, 13,
14, 16, 17, 31, 34, 39, 40 and 50). Another set of variables that are
not part of the LSI-R was also found to be associated with recidivism
rates. These included current age, marital status, education level,
measures of institutional conduct, and completion of certain programs
while incarcerated (see Table 5).

Linking Risk to Punishment and Treatment?

The above studies show that risk assessment is doable but that it need
not be complicated or expensive. Before adopting a particular system, an
agency needs to rigorously assess what model it can afford and administer
in a professional and accurate manner. If the wrong decisions are made
in terms of what model to buy, you may end up with little if any
enhancement to your ability to assess risk.

I want to close on another matter that seems to be receiving little
attention; namely, the requirement to administer or provide the proper
“intervention” that is consistent with risk. The major assumption in
evidence-based policy is that prisoners, probationers and parolees are
to be “serviced” and punished relative to their risk. But reaching this
standard can fail for two reasons. First, the
assessed risk level becomes moot if there are
no high quality programs or interventions 
to assign the “client” to once the assessment 
has been completed. For example, in Ver- 
mont only 14 percent of the released pris- 
oner sample had completed an educational, 
substance abuse, or sex treatment program 
while incarcerated, even though 31 percent 
of the sample were assessed as high risk. 


On the other end of the spectrum, we 
need to recognize that a very large propor- 
tion of the prison, probation, and parole 
populations is low risk; these offenders are 
being punished and even treated beyond 
their threat to public safety. It’s like a hospi- 
tal that decides to provide intensive care for 
patients who have a cold—the treatment is 
not only unnecessary but expensive. 


It would be helpful for those in the risk 
assessment business to start advocating a more 
reasonable level of intervention that matches 
the risk they have so carefully calibrated. 


TABLE 5:
Vermont Parole Board Final Risk Assessment Simulation Score by Item and Overall Risk Level

Static Item (N=644)
1. Age at First Arrest*
   16 years or older                 458    71%    45%
   Under 16 years                    186    29%    60%
2. Prior Charges/Suspensions under Supervision*
   No                                169    26%    36%
   Yes                               475    74%    54%
3. Crime Seriousness
   Level 1,2,4,5 and 10              334    52%    37%
   Level 3,6,7,8,9 and 11            310    48%    61%
4. Drug/Alcohol Abuse*
   None                              388    60%    42%
   [illegible]                       256    40%    60%
5. Prior Convictions*
   None                              387    60%    45%
   One                               109    17%    49%
   Two or more                       148    23%    60%
6. Criminal Acquaintances at Admission*
   No                                 43     7%    26%
   Yes                               601    93%    51%
7. Employed at least 12 months prior to Admission*
   Yes                               221    34%    41%
   No                                423    66%    53%

Dynamic Item (N=644)
8. Current Age
   50 and above                       46     7%    20%
   40-49                             133    21%    33%
   24-39                             325    50%    55%
   Under 23                          140    22%    61%
9. Most Severe Disciplinary Report
   None in the past 12 months        450    70%    44%
   Major A or Major B                194    30%    59%
10. Completed Education/Substance Abuse Program
   Yes                                87    14%    45%
   No                                557    86%    50%
11. Current Custody Level
   Minimum                             0     0%    N/A
   Medium                              0     0%    N/A
   Else                                0     0%    N/A
12. Current Marital Status
   Married, Divorced                 204    32%    43%
   Single                            440    68%    52%

Scored Risk Level
   Low                               151    23%    26%
   Moderate                          293    45%    49%
   High                              200    31%    67%


*Denotes LSI-R factors 
THE RISK ASSESSMENT process is undergoing
major change in federal, state and
local community corrections agencies across 
the country. New assessment instruments 
are being introduced, case management sys- 
tems are being redesigned, and the roles and 
responsibilities of line staff and management 
in community corrections agencies are being 
redefined, in large part due to the applica- 
tion of new, “soft” computer technology in 
community corrections agencies (Pattavina 
and Taxman, 2006). As Gottfredson and 
Tonry (1987) predicted in the late 1980s, 
“both the literature and practical application 
of science-based prediction and classifica- 
tion will continue to expand as institutions 
evolve to become more rational, more effi- 
cient, and more just” (vii). While rationality, 
efficiency, and justice are laudable goals for 
any criminal justice organization, we sus- 
pect that ultimately, it is the effectiveness of 
the community corrections system—both 
in terms of short-term offender control and 
long-term offender change—that really mat- 
ters to the public, and by extension, to policy- 
makers and practitioners. In the following 
article, we examine three key issues related 
to assessing the effectiveness of risk assess- 
ment procedures that need to be addressed: 
1) evidence-based practice and the link between
risk assessment and risk reduction, 2) the 
implications of both actuarial and clinical 
assessment for line staff and management, 
and 3) the need to combine individual risk
assessment and community risk assessment
in the next generation of risk-driven com- 
munity corrections strategies. We conclude 
by offering three simple recommendations 
designed to improve the effectiveness of the 
risk assessment process in federal, state, and 
local community corrections agencies. 


Issue 1: Evidence-based Practice 
and the (Missing) Link Between Risk 
Assessment and Risk Reduction 


When the term “best practices” is used, 
it typically refers to the results of an evi- 
dence-based review of the research on a 
topic of interest (e.g. scared straight pro- 
grams, prison and community-based treat- 
ment programs, etc.). Essentially, there are 
three different types of evidence-based 
reviews: 1) the “gold standard” evidence- 
based review focuses only on randomized, 
controlled experiments; 2) the “bronze 
standard” evidence-based review includes 
both experimental and well-designed quasi- 
experimental research, while using non- 
experimental research studies to confirm 
findings from higher quality research; and 
3) the unscientific (or nonsense) review, 
which does not identify specific study review 
criteria, relying instead on a selected subset 
of all studies available for review on the 
topic of interest. Not surprisingly, the last 
category of unscientific reviews is usually 
written by advocates of a particular program 
or strategy. In their most extreme form, the
authors of the review simply allude to an 
evidence-based review or to “best practices,” 
with no supporting documentation. Unfor- 
tunately, much of what is currently available 
in the corrections field—both institutional 
and community corrections—falls into this 
last category. 

Applying the “gold standard” for evidence- 
based reviews to the “risk assessment” pro- 
cess in community corrections mandates that 
at least two randomized field experiments 
must have been conducted in this area before 
we can offer an assessment of “what works” 
(see, for example, the reviews conducted for 
the Campbell/Cochrane collaboration at 
www.campbellcollaboration.org). Unfortu- 
nately, no experimental research has been 
conducted on this topic in community cor- 
rections, leading us to conclude that we don’t 
know whether there is a link between risk 
assessment (i.e., classification of an offender 
into high-risk, medium-risk and low-risk clas- 
sification categories) and risk reduction (i.e., 
a lower rate of recidivism for offenders than 
anticipated, given their risk level) due to the 
types of supervision and services we make 
available to offenders at each level of risk. 

Much of what we currently do in com- 
munity corrections is based on assumptions 
about the risk-reduction effects of placing 
offenders into different supervision levels 
that have not been tested empirically, using 
randomized field experiments. What would 
happen, for example, if we placed high-risk
offenders under “medium” or “low” super- 
vision? Alternatively, what would be the 
impact of placing a low- or medium-risk 
offender under “maximum” supervision? 
Until we have the results of quality, experi- 
mental research to review, we will continue 
to make assumptions about “what works” and
“best practices” in terms of both risk assess- 
ment and risk reduction that are simply not 
supported by a careful, “gold-standard” evi- 
dence-based review. 


Issue 2: The Implications of 
Actuarial and Clinical Assessment 
for Line Staff and Management 


One argument that can be made concerning 
the use of clinical vs. actuarial risk assess- 
ment is that the line staff currently hired 
in community corrections do not have the 
background and qualifications necessary to 
conduct “clinical” assessments of offender 
risk, particularly for special category (e.g. 
mental health, substance abuse, sex offend- 
er) and multiple problem offenders. Assum- 
ing for the sake of argument that you want to 
introduce clinical assessments into your federal,
state, or local community corrections
agency, you have two choices: 1) recruit/hire 
line staff with the necessary qualifications 
to conduct clinical risk assessments (perhaps 
with minimal additional training); or 2) 
privatize the assessment process, using the 
network of current mental health treatment 
providers as your “target” potential provider. 
Indeed, it could be argued that by moving 
away from clinical and toward actuarially- 
based risk assessments, we are attempting 
to simplify classification/decision-making 
in an effort to reduce the need for higher-skilled
line staff (i.e., the “dummying down”
of community corrections). In this scenario,
it is possible to envision a probation or
parole agency where line staff are responsible
for case planning and supervision, but
other functions (assessments, treatment, and 
services) are subcontracted to agencies in the 
private sector. 

While improving staff quality and/or 
privatization are options to consider even if 
you are not using clinical assessments, the 
evidence available from major reviews of the 
available research certainly suggests that you 
will not improve the risk assessment system 
using clinical assessments, because actuarial 
risk assessments consistently “outperform” 
clinical risk assessment procedures (see Harris,
this issue; and Gottfredson and Moriarty, this
issue). However, we should point
out that the new generation of “actuarial”
risk assessment instruments currently being
used in community corrections agencies,
including the popular LSI-R instrument
discussed in several articles in this issue,
actually requires both objective and subjective
(or clinical) assessments by line staff.
In fact, the distinction between actuarial 
and clinical assessment is becoming blurred, 
with consequences for line community cor- 
rections personnel (and management) that 
are important to consider. It is our conten- 
tion that if we continue further in this direc- 
tion, then changes in either staff quality or in 
the privatization of the assessment function 
may be needed. 

According to a recent review by Brum- 
baugh and Steffey (2005), three of every four 
probation and parole agencies in this country 
employ “objective” risk/needs instruments 
to classify offenders, using either the shorter 
Wisconsin risk/needs assessment instrument 
or the longer, 54-item LSI-R mentioned earli- 
er. Both instruments require line staff to make 
both objective assessments (such as prior con- 
victions, current employment) and subjective 
assessments (such as extent of drug problems, 
attitude, mental health). Not surprisingly, the 
results of a number of inter-rater reliability 
studies reveal that line community correc- 
tions staff are much more consistent in their 
scoring of objective than subjective items (see, 
e.g. Austin, this issue; Byrne and Robinson, 
1991; and Harris, this issue). 

The use of a large number of items in a 
risk instrument is likely to exacerbate the 
inter-rater reliability problem. Austin (this 
issue) for example, pointed out that (in an 
inter-rater reliability study he conducted) 
of the 54 items included in the LSI-R (37 
yes/no items and 17 Likert scale items), only
16 items had an agreement rate of 80 percent 
or higher, with 38 other items scoring in the 
60-70 percent range. Overall, Austin found 
that these scoring differences on individual 
risk items resulted in disagreement on the 
scoring of the offender’s risk level in 29 
percent of the 120 cases reviewed by the two 
staff members included in the inter-rater 
reliability test. 
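
The step from item-level scoring differences to risk-level disagreement can be illustrated with a short Python sketch. The score ranges (low 0 through 15, medium 16 through 22, high 23 and above) are those shown in Austin's Table 2 (this issue); the function names and the five pairs of total scores are hypothetical and serve only to show the calculation.

```python
from typing import Sequence, Tuple

LEVELS = ["low", "medium", "high"]


def lsi_r_level(total_score: int) -> str:
    """Risk level for a total score, using the ranges shown in Austin's Table 2."""
    if total_score <= 15:
        return "low"
    if total_score <= 22:
        return "medium"
    return "high"


def level_disagreement(first: Sequence[int], second: Sequence[int]) -> Tuple[float, float]:
    """Return (percent of cases with any level disagreement,
    percent of cases that disagree by exactly one level)."""
    pairs = list(zip(first, second))
    if not pairs:
        raise ValueError("at least one doubly scored case is required")
    gaps = [abs(LEVELS.index(lsi_r_level(a)) - LEVELS.index(lsi_r_level(b)))
            for a, b in pairs]
    any_gap = sum(1 for g in gaps if g > 0)
    one_gap = sum(1 for g in gaps if g == 1)
    return 100.0 * any_gap / len(pairs), 100.0 * one_gap / len(pairs)


# Hypothetical total scores from two raters on the same five cases.
first_rater = [10, 14, 18, 25, 30]
second_rater = [12, 17, 21, 24, 15]

print(level_disagreement(first_rater, second_rater))
```

Two raters can differ on several individual items and still agree on the risk level, which is why the level-based measure is the more generous of the two.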

We can only speculate about how such 
differences in the scoring of individual risk 
items would affect risk assessment (and 
the classification of offenders into high-, 
medium-, and low-risk categories) across 
an entire department. However, Austin’s 
research certainly suggests that new strategies
need to be developed to improve the
level of inter-rater reliability before the agen- 
cy embarks on the time-consuming risk 
assessment construction/validation process. 
Our recommendation is to collect data on 
fewer items, focusing primarily on objective 
items that are relatively easy to code. Austin 
(this issue) found that he could improve both 
the reliability and validity of the LSI-R by 
focusing on a subset of only 8 of the 54 original
LSI-R items. According to Austin, “not
only do these items have better predictive
ability, but also they reduce the ‘high risk’
category” (this issue). Since most observers
(see, e.g., Lowenkamp and Latessa, 2005) 
recommend providing the highest level of 
supervision and services to high-risk offend- 
ers, the cost effectiveness of a more precisely 
defined—and smaller—high-risk classifica- 
tion category should be obvious. 

In addition to Austin’s research, the find- 
ings from other inter-rater reliability studies 
indicate that line staff characteristics (such 
as age, gender, race, location, experience) 
will likely affect the scoring of risk assess- 
ment instrument items in ways that are 
important to consider. For example, Byrne 
and Robinson (1992) identified gender bias 
as a potential problem affecting inter-rater 
reliability. In their study of inter-rater reli- 
ability among 130 probation officers, they 
distributed two different versions of a “case 
study”: in version A, the juvenile (Sandy) was 
described as female; in version B, the juvenile 
was described as male. There were no other 
differences between the two “case studies.” 
Significant differences in overall risk scor- 
ing were identified, with the female version 
of the case receiving higher scores than the 
male version of the same case, resulting in a 
greater proportion of the female cases being 
classified as high risk (40.4 percent) than 
their “male” counterparts (33.1 percent). 

We suspect that in addition to variations 
in scoring and consistency due to offender 
characteristics (such as gender, race, class), 
there will be variation in scoring and con- 
sistency due to the characteristics of the 
line probation/parole officers completing 
the assessment (see, e.g., Byrne and Robinson,
1990). The findings from both the
Austin study and Byrne and Robinson study 
underscore the importance of conducting 
an inter-rater reliability study, not only to 
support initial risk instrument development, 
but also to examine the very real possibility 
that bias (related to both the characteristics 
of offenders and the characteristics of line 
staff) is having a detrimental effect on the
risk assessment process. 

One other area where clinical (subjec- 
tive) judgment enters into the classification 
process is in the agency’s risk scoring “over- 
ride” policy. While there will undoubtedly 
be circumstances where an offender will be 
either over-classified or under-classified by 
line staff members, and/or by management 
decisions to ignore risk altogether (such 
as offense exclusions for sex offenders), it 
is critical that the level, type, and circum- 
stances of over-ride usage be monitored on 
an ongoing basis. A simple rule of thumb for 
this type of review is to apply a 10 percent 
rule: if more than 10 percent of the agency's 
risk scoring decisions are being changed, 
then the agency has a problem in this area 
that needs to be resolved. 
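
The 10 percent rule lends itself to routine monitoring. The Python sketch below assumes that an agency records, for each case, both the level produced by the instrument and the level finally assigned; the function names and the sample data are hypothetical.

```python
from typing import Sequence


def override_rate(scored_levels: Sequence[str], final_levels: Sequence[str]) -> float:
    """Percent of cases whose final risk level differs from the scored level."""
    if len(scored_levels) != len(final_levels) or len(scored_levels) == 0:
        raise ValueError("both lists must cover the same non-empty set of cases")
    changed = sum(1 for s, f in zip(scored_levels, final_levels) if s != f)
    return 100.0 * changed / len(scored_levels)


def exceeds_override_rule(scored_levels: Sequence[str],
                          final_levels: Sequence[str],
                          limit: float = 10.0) -> bool:
    """True when more than `limit` percent of scoring decisions were changed."""
    return override_rate(scored_levels, final_levels) > limit


# Hypothetical caseload of 20: three scored-low cases were moved to high.
scored = ["low"] * 10 + ["medium"] * 6 + ["high"] * 4
final = ["high"] * 3 + ["low"] * 7 + ["medium"] * 6 + ["high"] * 4

print(override_rate(scored, final), exceeds_override_rule(scored, final))
```

A fuller review would also break the rate down by direction (over-classification versus under-classification) and by the stated reason for the override, as the text recommends.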

Finally, our discussion of the utiliza- 
tion of clinical and actuarial risk instru- 
ments in community corrections would be 
incomplete without mention of the “valida- 
tion” problem. According to a recent study 
by Hubbard, Travis, and Latessa (2001), 
“only 30 percent of the agencies that use 
an assessment instrument reported that the 
instrument was validated for their local pop- 
ulation” (as summarized by Brumbaugh and 
Steffey, 2005:59). Without the completion of 
the necessary validation research, there is no 
way of knowing whether the risk instrument 
being used by a particular agency results in 
an accurate classification of offenders into 
the low-, medium-, and high-risk categories 
used to allocate scarce probation and parole 
resources, both in terms of supervision and 
in terms of services. 

While there certainly has been much dis- 
cussion of the need to apply the “risk prin- 
ciple” (risk, needs, responsivity) to offenders 
supervised in the community corrections 
system (see, e.g., Andrews et al., 1990;
Latessa and Lowenkamp, 2005), it appears
that the determination of “risk level” may 
be inaccurate for a significant number of 
these offenders. Recent attempts to improve 
the risk assessment process using actuarial 
instruments may have made matters worse, 
because of the inter-rater reliability prob- 
lems associated with the more complex risk 
assessment instruments currently being 
used, such as the 54-item LSI-R. The impli- 
cations of our brief review of the use of 
actuarial vs. clinical assessment are straight- 
forward: in order to improve both reliability 
and validity, risk assessment instruments 
need to be designed using a small number of
objective risk items and tested (for reliability
and validity) on an ongoing basis. 

We anticipate that the continued develop- 
ment of LSI-R type assessment instruments, 
along with the use of offender-specific 
assessment devices (for categories such as 
sex offender, mentally ill offender, substance 
abusing offender, multiple-problem offend- 
er) will require more qualified line staff 
and/or the use of private sector assessment 
centers. However, current line staff in community
corrections agencies are certainly
qualified to classify the risk level of offenders 
using the simplified versions of the actuarial 
risk instruments advocated by Austin (this 
issue). The challenge for community cor- 
rections is to allocate resources to offenders 
placed in various risk classification levels 
in a manner that maximizes the system’s 
overall effectiveness. Getting the assessment 
“right” is the critical initial step, but it must 
be followed by improvements in treatment 
classification, and subsequent case planning 
strategies (see Taxman, this issue). 


Issue 3: The Need to Combine 
Individual and Community Risk 
Assessment 


A number of recent evidence-based reviews of 
the research in community corrections have 
identified statistically significant, but mod- 
est (10 percent) recidivism reduction effects 
associated with a variety of community treat- 
ment strategies (see Welsh and Farrington, 
2006). We suspect that the recidivism reduc- 
tion effects identified in these studies would 
be even more pronounced if individual-level 
assessments of risk were combined with com- 
munity-level risk assessments (Byrne, 2006; 
Pattavina, Byrne, and Garcia, 2006), based 
on the premise that community-level risk 
assessment is a necessary “first step” in the 
community change process. 

We offer this assessment based on two 
related factors: first, there is a large body 
of research supporting the notion that an 
individual’s risk of re-offending is affected 
—both positively and negatively—by the 
community in which he/she resides while 
under community supervision (Sampson 
and Bean, 2005; Sampson and Raudenbush, 
2004; Pattavina, Byrne, and Garcia, 2006). 
Second, the treatment resources available 
to offenders will also likely vary by the “risk 
level” of the neighborhood, with higher-risk 
neighborhoods offering fewer (and lower 
quality) treatment options to offenders living
in these areas (Jacobson, 2006). Accuracy
of the individually-based risk classification
system will likely improve with the inclu- 
sion of overall community risk level (high 
vs. low/medium risk, for example, based 
on offender density and/or the area’s crime 
rate), along with selected community “risk” 
characteristics (such as unemployment rate, 
proportion of residents living in poverty, 
size/characteristics of first generation immi- 
grant population). Similarly, the accuracy of 
the individually-based treatment classifica- 
tion system (linking offenders at different 
risk levels to appropriate treatment) would 
also be improved by an assessment of com- 
munity risk level, because this classification 
decision could be based on an assessment 
of the likely impact of community culture 
(such as attitudes toward substance use, 
criminal thinking, etc.) on the attitudes 
and behavior of offenders residing in “high- 
risk” and low/medium-risk neighborhoods 
(Sampson and Bean, 2005). 


Concluding comments 


While there have been significant improve- 
ments in the individual offender assessment 
procedures used by community corrections 
agencies over the past two decades, our 
brief review suggests the following: 1) we 
need to conduct high quality experimental 
research on the effectiveness of both risk and 
treatment classification systems, using risk 
reduction as our primary outcome measure; 
2) we need to consider simpler alternatives 
to both the general (e.g. LSI-R) and offender- 
specific (e.g. mentally ill, substance abuser, 
sex offender) risk assessment devices; and 3) 
we need to incorporate community-level risk 
factors into our current assessment system. 
[The federal probation and pretrial services system is making steady
progress towards a comprehensive outcome-measurement system¹ that will
allow the federal judiciary’s policymaking body, the Judicial Conference
of the United States, and chief probation and pretrial services officers
and system managers to make decisions based on ongoing empirical analysis
of reliable and pertinent data. We expect to have much of the data
collection infrastructure in place in fiscal year 2007 and all of it in
place in fiscal year 2008. We will then be able to identify various
recidivism-reduction strategies used by different districts and will be
able to track large cohorts of offenders to determine what strategies
appeared to make a difference under what circumstances for what types
of offenders.

Pending completion of their own internal outcome-based system, federal
probation and pretrial services officers are eagerly adopting
research-supported recidivism-reduction strategies identified in the
growing body of research into state and local programs known as
Evidence-based Practices.² When the federal outcome-based system is
fully implemented, it will be possible to determine whether those
evidence-based practices made a significant difference in measurable
outcomes and, if so, whether they should be implemented in other
districts.

What follows is a report prepared by Caliber, An ICF Consulting Company,
under contract with the Administrative Office of the United States
Courts. This report provides some useful background and describes
precisely how we have built and will continue to build our outcome-based
system. Much of the body of the report is reproduced below, except for
most of the attachments and section 6, which details nuts-and-bolts
implementation.]

¹ The terms “outcome-based” and “results-based” are used interchangeably.

² The term “evidence-based practice” implies that 1) there is a definable
outcome(s); 2) it is measurable; and 3) it is defined according to
practical realities (e.g. recidivism).


[Flow chart: STATUS OF PROCESS FOR ESTABLISHING OUTCOME INDICATORS FOR SUPERVISION FUNCTIONS (Black=Completed; Gray=In Progress; White=Future Task). Column headings: Activity; Section Product (Subject To Draft and Review Process). Phases: Goal-setting Phase; Technical Phase; Analytical Phase. Box labels: Operationalize Outcomes; Develop Analytical Model; Identify available data sources and elements; Framework Design; Identify Additional Data Requirements; Analyze available data; System Upgrade; Current Status and Targets For Each Outcome Measure; Annual Assessments of Progress and Necessary Adjustments.]
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[The federal probation and pretrial services 
system is making steady progress towards a 
comprehensive outcome-measurement sys- 
tem'! that will allow the federal judiciary’s 
policymaking body, the Judicial Conference 
of the United States, and chief probation and 
pretrial services officers and system managers 
to make decisions based on ongoing empirical 
analysis of reliable and pertinent data. We 
expect to have much of the data collection 
infrastructure in place in fiscal year 2007 and 
all of it in place in fiscal year 2008. We will then 
be able to identify various recidivism-reduc- 
tion strategies used by different districts and 
will be able to track large cohorts of offenders 
to determine what strategies appeared to make 
a difference under what circumstances for 
what types of offenders. 

Pending completion of their own internal 
outcome-based system, federal probation and 
pretrial services officers are eagerly adopt- 
ing research-supported recidivism-reduc- 
tion strategies identified in the growing body 
of research into state and local programs 
known as Evidence-based Practices.** When 
the federal outcome-based system is fully 
implemented, it will be possible to deter- 
mine whether those evidence-based practices 
made a significant difference in measurable 
outcomes and, if so, whether they should be 
implemented in other districts. 

What follows is a report prepared by
Caliber, an ICF Consulting Company, under
contract with the Administrative Office of the
United States Courts. This report provides
some useful background and describes precisely
how we have built and will continue to build
our outcome-based system. Much of the body
of the report is reproduced below, except for
most of the attachments and section 6, which
details nuts-and-bolts implementation.]

1 The terms “outcome-based” and “results-based”
are used interchangeably.

2 The term “evidence-based practice” implies
that 1) there is a definable outcome(s); 2) it is
measurable; and 3) it is defined according to
practical realities (e.g., recidivism).


STATUS OF PROCESS FOR ESTABLISHING OUTCOME INDICATORS
FOR SUPERVISION FUNCTIONS
(Black=Completed; Gray=In Progress; White=Future Task)

[Flow chart; box shading not recoverable from the
source. Columns: Activity; Section Product (Subject
To Draft and Review Process). Phases: Goal-setting
Phase; Technical Phase; Analytical Phase. Boxes:
Operationalize Outcomes; Develop Analytical Model;
Identify Available Data Sources and Elements;
Identify Additional Data Requirements; Analyze
Available Data; Framework Design; System Upgrade;
Current Status and Targets For Each Outcome
Measure; Annual Assessments of Progress and
Necessary Adjustments.]

1. Project Background


The federal probation and pretrial services 
system is developing a results-based man- 
agement framework that will, in the future, 
allow it to better assess performance—and 
make programming and resourcing deci- 
sions—based on what it accomplishes rath- 
er than solely on what it does. The flow 
chart below shows the steps involved in 
developing the framework, and highlights 
where we are in the process. 

This focus on results, and the work 
done to date to define the system’s mis- 
sion, goals and desired outcomes, stem 
from a number of complementary influ- 
ences and projects. 

1. In 1999, the Administrative Office 
entered into a contract with a team 
of independent consultants, led by 
IBM, to conduct a strategic assess- 
ment of the federal probation and pre- 
trial services system. The overarching 
recommendation from that assess- 
ment—presented first to the Admin- 
istrative Office in 2003—was that the 
federal probation and pretrial ser- 
vices system become a results-driven 
organization with a comprehensive 
performance measurement system. 

2. In 2000, the AO Director appointed 
an Ad Hoc Supervision Work Group 
comprised of supervisors, deputies 
and chiefs from seven districts and 
a representative of the Federal Judi- 
cial Center to update the supervision 
policy monographs. As part of its 
work, the group reviewed relevant 
statutes and mission statements to 
identify the desired outcomes and 
goals to be served by the pretrial ser- 
vices and post-conviction supervision 
functions. These outcomes and goals 
were incorporated into revised super- 
vision policy documents approved by 
the Judicial Conference of the United 
States in 2003. 

3. Strategic planning sessions were conducted
at the Federal Judicial Center’s 2000 and
2002 National Chiefs Conferences. The 2000
conference produced a “Desired Futures”
roadmap, the first element of which was:
“Desired Outcomes are clear, measured
and results are communicated.” The 2002
conference resulted in a “Charter for
Excellence” that sets forth broad system
goals and values (see Attachment 1).

4. In September 2003, one of the IBM 
strategic assessment consultants facil- 
itated a strategic planning session 
at a meeting of the Chiefs Advisory 
Group to translate the broad “Charter 
for Excellence” statements into more 
specific “Operational Goals.” 

The operational goals developed by the 
Chiefs Advisory Group were combined with 
the desired outcomes set forth in the revised 
supervision monographs to form the basic 
structure of the results-based management 
framework (see Attachment 2). This con- 
cluded the initial goal-setting stage of the 
framework development process.³

The current stage of the process is technical:
the development of operational definitions
and associated measures for each “desired
outcome,” and of statistical approaches to
analyzing the information that will ensure
“apples-to-apples” comparisons and allow
benchmarking with other programs. To assist
in this phase:

■ In March 2004, the AO director appointed
an Ad Hoc Expert Panel on Outcome
Measurement Methodology. This
panel is comprised of the directors 
of research for the Federal Bureau of 
Prisons, the Federal Judicial Center 
and the District of Columbia’s Court 
Services and Offender Supervision 
Agency; and academics and Admin- 
istrative Office staff with extensive 
backgrounds in criminal justice/sub- 
stance abuse research and government 
performance measurement systems. 
The panel met twice in 2004 and pro- 
duced recommendations for measuring 
the concepts of recidivism and sub- 
stance abuse (see Attachment 3). 

■ In April 2005, the AO entered into a
contract with Caliber Associates to pro- 
vide additional support in developing, 
coordinating and presenting recom- 
mendations for the technical aspects 
of the results-based framework. 

3 As shown by the framework development flow
chart, the process is iterative. All references to
“completion” refer to the initial development
process.

This final product from the technical
phase includes recommendations, to be
circulated to system staff and stakeholders
for broad system comment, that span the
lifecycle of the results-based management
framework. The report addresses:

■ How to measure a variety of outcomes—
including offender compliance, positive
change, and crime reduction;

■ What data are needed to construct the
recommended measures;

■ What analytical methodologies can be
used to assess how these results are
affected by supervision interventions as
well as a variety of case, offender and
community factors;

■ What tasks are necessary to fully implement
the framework; and

■ How to institutionalize the framework
within the federal probation and pretrial
services system.

The recommendations represent “state 
of the art” measurement and analytical 
approaches that are being used by other 
performance-based systems, program 
evaluations and/or academic research in 
criminal justice and related areas such as 
substance abuse. 


2. Post-conviction Supervision 
Logic Model 


Building on the results of the goal-setting 
stage of this project, the next step was 
to develop a logic model that depicts the 
underlying assumptions about how “what 
the system does” affects what it is trying to 
accomplish; and what other factors—e.g., 
characteristics of the offenders to be super- 
vised, the requirements and restrictions of 
their sentences, and the system resources 
devoted to carrying out the supervision 
mission—are expected to influence this 
relationship. The logic model for post-con- 
viction supervision (see next page) is used 
throughout this document to illustrate the 
technical concepts that are incorporated 
into the framework design. 

This logic model has been refined twice
since its development following the
goal-setting stage, based on input from the
Expert Panel and Caliber Associates. It
will continue to be a work in progress
that evolves to incorporate feedback from
system staff and stakeholders, and results
from empirical testing of the posited
relationships. A similar logic model was
developed for pretrial services, which will
be incorporated into the framework and
follow a similar evolution process.
2.1 Components of the Logic
Model 


The post-conviction supervision logic model 
has six components: inputs, process (activi- 
ties), process outcomes, intermediate out- 
comes, ultimate outcomes, and mission. 
Each component is described below. 


Inputs 


Inputs are characteristics of the offender 
population and the working environment 
that are hypothesized to affect expected 
outcomes regardless of system interventions. 
For example, prior research indicates that 
offenders with a lengthy prior record are 
more likely to become re-involved in crimi- 
nal activity than those with no or a mini- 
mal prior record. This leads to a working 
assumption that, regardless of supervision 
interventions, districts that have a high per- 
centage of first offenders will have a lower 
recidivism rate than those with a low per- 
centage of first offenders. 

Inputs are used in the analytical model as 
“control” variables to account for the effects 
of factors that explain differences in out- 
comes across offices, districts and time that 
are not related to system interventions. They 
may also be used as stratification categories 
to display outcomes based on key group- 
ings, e.g., by type of supervision (probation/ 
parole/supervised release) in reports. 
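
As a minimal sketch of the stratification use described above, the snippet below tabulates an outcome rate separately for each supervision type. The record layout, the function name `rate_by_stratum`, and the sample values are hypothetical and invented for illustration; they are not drawn from any federal data system.

```python
from collections import defaultdict

# Hypothetical case records: (supervision_type, rearrested_flag).
# These values are illustrative assumptions, not real data.
cases = [
    ("probation", False),
    ("probation", True),
    ("probation", False),
    ("probation", False),
    ("supervised release", True),
    ("supervised release", False),
    ("parole", True),
    ("parole", True),
]


def rate_by_stratum(records):
    """Return {stratum: share of cases with the outcome flag set}."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for stratum, flag in records:
        totals[stratum] += 1
        events[stratum] += int(flag)
    return {s: events[s] / totals[s] for s in totals}


rates = rate_by_stratum(cases)
for stratum, rate in sorted(rates.items()):
    print(f"{stratum}: {rate:.0%}")
```

Displaying the rate per stratum, rather than one pooled rate, keeps differences in caseload mix from being mistaken for differences in supervision results.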

The current model includes as inputs 
those factors identified in the research and 
program evaluation literature as related to 
criminal justice goals. These include: 
■ Offender characteristics (e.g., prior
record, employment, family/community
connections, demographics);

■ Characteristics of the instant offense (e.g.,
class and category);

■ Sentence parameters (e.g., length of prison
and supervision terms imposed and
served, conditions imposed);

■ Office/community characteristics
(e.g., location, size, socio-economic
indicators);

■ Officer characteristics (e.g., experience,
demographics, education);

■ Supervision resources (e.g., supervision
staffing, contract budgets, technological
support).

The input categories will be further
defined and the categories and their specific
elements assessed for adequacy by system
staff and stakeholders as part of the framework
implementation (see Subsection 6.1).
These inputs will also be applied, as
appropriate, to the pretrial services model.


Process (Activities) 


Process refers to activities undertaken by the 
system—practices, programs and interven- 
tions—that implement the supervision func- 
tion. As an example: An officer conducts an 
initial assessment investigation, identifies 
lack of stable employment as a risk, and 
refers the offender for job counseling or to a 
job referral agency. In the analytical model,
the process variables define “what we do” for
purposes of assessing how “what we do”
relates to what we are trying to accomplish.

The current logic model includes only 
the most general process categories, e.g., 
investigation, assessment, monitoring, refer- 
ral, and assistance. Detailed input on the 
specific processes that should be included in 
the model will be sought from system staff 
and stakeholders—the experts in identifying 
and defining salient system activities—as 
part of the initial next step in implementing 
the framework. 


Process Outcomes 


Process outcomes describe offender actions
that occur as a result of system activities.
For example, in response to an employment
referral, the offender registers with an
employment service or completes “x” hours
of employment counseling.

Process outcomes enter the analytical
model as both an outcome of the service
delivery process and as an input (control) for
assessing intermediate and ultimate outcomes.
For example, “number of hours of employment
counseling” is a measure of how successful an
officer’s employment referrals are in engaging
offenders in employment services. This
measure is also a “control” when addressing
the extent to which an intermediate outcome,
such as improved employment, might be
attributed to the supervision intervention.

The model includes only broad categories
of compliance with each of the four major
types of conditions: restrictions, correctional
programming, service, and financial. These
will be expanded as a result of comments
received from system staff and stakeholders.


Intermediate Outcomes 


Intermediate outcomes are changes in 
offender behavior that are themselves desir- 
able and believed also to be precursors of the 
ultimate outcomes. Specifically, the
intermediate outcomes in the model are defined
as a desirable change in a circumstance that 
has an empirically proven relationship to 
successful supervision and that is within an 
officer’s authority and sphere of influence. 

Three intermediate outcomes are included
in the current model: reductions in
substance abuse, improvements in employment,
and improvements in other life skills.
Each of these changes is believed to relate
to the likelihood of criminality during the
period of supervision and beyond. Each
is also expected to relate to certain
“sentence execution” outcomes (e.g., improved
employment → enhanced earning capacity
→ more money to pay restitution).

The list of intermediate outcomes cur- 
rently in the model does not include all of 
the operational goals that emerged from the 
Chiefs Advisory Group’s strategic planning 
session in 2003. This results from the limita- 
tions that were subsequently attached to the 
definition of intermediate outcomes. For 
example, “Improvements in Mental Health” 
is not listed as an intermediate outcome 
because there is no empirical association 
between general mental health problems and 
criminality. This does not mean that mental 
health issues will not be considered, but 
rather that their relationship to ultimate out- 
comes will be considered in terms of offend- 
er and sentence characteristics (e.g., mental 
health needs and conditions as inputs), refer- 
rals for mental health counseling (process), 
and the offender’s participation in that pro- 
gramming (process outcomes). 


Ultimate Outcomes and Mission 


The ultimate outcomes are set forth in The 
Supervision of Federal Offenders, Monograph 
109, which establishes Judicial Conference 
policies related to post-conviction supervi- 
sion. These outcomes are: To execute the 
sentence and to protect the community dur- 
ing the period of supervision and beyond. 

These outcomes stem directly from the 
system mission endorsed by the Judicial 
Conference in September 1993—To protect 
the public and to assist in the fair admin- 
istration of justice—supplemented by the 
statutory provisions that establish the duties 
of probation officers and the purposes that 
community sentences are to serve. 

Two of the ultimate outcomes—mini- 
mizing criminal activity during the period 
of supervision and beyond—relate to the 
system mission to protect the public. The 
other ultimate outcomes measure the impact 
of compliance with release conditions (e.g.,
restoration of victims). These serve as direct 
measures of sentence execution and sur- 
rogates for the mission to assist in the fair 
administration of justice—with no expecta- 
tion that they will affect the system’s public 
protection mission. Examples from the lit- 
erature of more precise operational mea- 
sures for process outcomes, intermediate 
outcomes, and ultimate outcomes are pro- 
vided in Section 3 and further delineated in 
Attachment 5: Key Element Definitions and 
Attachment 6: Data Matrix. 


2.2 Relationships among 
Components 


The arrows in the logic model indicate the 
specific expected relationships between 
components that the analytical model will 
be designed to test. As described in Section 
4, statistical techniques will be applied to test 
the relationships depicted. 

The analysis will test a complete thread of
the model, moving from left to right. Basic
and advanced techniques will be used to
test direct and indirect relationships, both
unidirectional and bidirectional, while
controlling for inputs that are primarily
static and outside the control of the officer.
The results will move the system beyond 
a description of the offender population 
and individual outcomes to a more com- 
plex assessment of the “theory of change” 
and the interconnectedness of process and 
outcomes for post-conviction supervision. 
Similar relationships will be tested for pre- 
trial services. 


3. Operationalizing 
Post-conviction Supervision 
Outcomes 


This section further defines the process, 
intermediate, and ultimate outcomes in 
measurable terms. In order to empirically 
test the hypothesized relationships between 
post-conviction processes (activities) and 
outcomes of the offender population, it is 
necessary to first identify appropriate measures
for each outcome. A data matrix was
developed based on a review of current
evaluation research.⁴ The matrix organizes each
outcome into the following categories:

■ Type—The concept of interest.

■ Definition—A brief description of each
concept. The descriptions are important
to ensure standardization in how data are
defined across districts.

■ How Operationalized—How the concept
will be measured.

■ Data Element—The specific piece of data to
be captured so that certain tests can be
performed to help answer questions of
interest. Data may be reported in days, weeks,
dollars, cents, or by selecting “yes” or “no”
options or other response categories.

■ Level of Measurement—The level at which
the data will be measured (nominal, ordinal,
interval, ratio). This is important for
determining the type of analysis that can
be performed using each measure.

■ Data Source—Where the identified data
may be obtained. In some cases, the data
may not currently be collected and,
therefore, the data source will need to be
determined by system staff and stakeholders.

4 While the focus of the measurement matrix is
on outcomes, examples of measures for the inputs
depicted in the logic model have been provided. It
is expected that these will be revised and refined
based on review and comment from system staff
and stakeholders.
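
One way to picture a row of the data matrix is as a small record with one field per category. The sketch below is illustrative only: the class and field names, the `SUMMARIES` lookup, and the sample row are assumptions made for this example, and the mapping from level of measurement to suitable summary statistics is the usual textbook rule of thumb rather than anything specified in the report.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


@dataclass(frozen=True)
class MatrixRow:
    outcome_type: str     # Type: the concept of interest
    definition: str       # brief, standardized description
    operationalized: str  # how the concept will be measured
    data_element: str     # specific piece of data to capture
    level: Level          # level of measurement
    data_source: str      # where the data may be obtained


# Summary statistics that are meaningful at each level of measurement.
SUMMARIES = {
    Level.NOMINAL: ["frequency", "mode"],
    Level.ORDINAL: ["frequency", "mode", "median"],
    Level.INTERVAL: ["frequency", "mode", "median", "mean"],
    Level.RATIO: ["frequency", "mode", "median", "mean", "ratio"],
}

row = MatrixRow(
    outcome_type="Comply with service conditions",
    definition="Completion of imposed non-paid community service",
    operationalized="Hours completed vs. hours imposed",
    data_element="community_service_hours_completed",  # hypothetical name
    level=Level.RATIO,
    data_source="to be determined",  # placeholder, not a real source
)
print(row.data_element, "->", SUMMARIES[row.level])
```

Recording the level of measurement alongside each data element makes it straightforward to check, before analysis, that a planned statistic is appropriate for the measure.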


3.1 Process Outcomes 


A process outcome represents the immedi- 
ate outcome for the offender as a result of 
system activities. The four process outcomes 
are: compliance with restrictive conditions, 
participation in correctional programming, 
compliance with service conditions and com- 
pliance with financial conditions. Examples 
of the types of data that could be collected 
for each outcome are described below. 

■ Comply with restrictive conditions (e.g.,
home confinement conditions, halfway
house placement, employment conditions,
prohibition of contact with
victim/minor/associates, remote location
monitoring, and nighttime/weekend
jail requirements). The process outcome
measures for all conditions will include
a dichotomous measure of compliance
(complied or did not comply) and the
number of noncompliant events involving
the condition. An additional measure for any
condition with an associated time component
will be calculated based on days
imposed vs. days completed. The process
outcomes will be calculated separately
for each restrictive condition and for the
restrictive category as a whole.

■ Participate in correctional programming—
Measures of this process outcome
will include: number of days from start
to end of program (duration); number
of hours of service (per week and total);
number of sessions attended vs. sessions
scheduled; end-of-treatment provider
assessment of quality of participation
(5-point scale); and completion status
(successful/unsuccessful completion).
The results will be presented by type
of program (substance abuse, education/
employment/job training, mental health
treatment, sex offender, life skills, basics),
funding source (no cost, contracted, other
government program, or self-insured),
and modality of treatment (inpatient,
individual, group, family-individual, or
family-group).

■ Comply with service conditions—Service
conditions consist of a requirement for the
offender to complete hours of non-paid
community service. Compliance with
service conditions will be determined by
number of community service hours
completed vs. the number of hours imposed. A
dichotomous compliance status measure
(complied or did not comply) can then be
calculated based on whether the offender
successfully completed the imposed hours
of non-paid community service.

■ Comply with financial conditions (e.g.,
fine, restitution, special assessment, no
new debt/credit, cooperate with IRS,
child support enforcement)—Measures for
all conditions will include a dichotomous
measure of compliance (complied
or did not comply) and the number of
noncompliant events involving the condition.
An additional measure for any financial
condition with an associated amount will be
amount expected by payment schedule
vs. amount paid. The process outcomes
will be calculated separately for each
financial condition and for the financial
category as a whole.

The process outcomes data described 
above will be used in the analysis described 
in Section 4 as both dependent variables 
(predicted outcome of system activities) and 
independent variables (predictors of inter- 
mediate outcomes). 
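
The process outcome measures listed above can be sketched as a few small functions. This is a minimal illustration under stated assumptions: the `Condition` layout and function names are hypothetical, and the rule used for the dichotomous measure (no noncompliant events and, where a quantity was imposed, the full quantity completed) is one plausible reading rather than a definition taken from the report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Condition:
    """One imposed condition for one offender (hypothetical layout)."""
    name: str
    noncompliant_events: int
    imposed: Optional[float] = None    # days, hours, or dollars expected
    completed: Optional[float] = None  # days, hours, or dollars completed/paid


def completion_ratio(c):
    """Completed vs. imposed quantity; None when no quantity applies."""
    if c.imposed is None or c.completed is None or c.imposed <= 0:
        return None
    return c.completed / c.imposed


def complied(c):
    """Dichotomous measure (assumed rule): no noncompliant events and,
    where a quantity was imposed, the full quantity completed."""
    ratio = completion_ratio(c)
    return c.noncompliant_events == 0 and (ratio is None or ratio >= 1.0)


def category_summary(conditions):
    """Roll individual conditions up to the category as a whole."""
    return {
        "complied": all(complied(c) for c in conditions),
        "noncompliant_events": sum(c.noncompliant_events for c in conditions),
    }


# Illustrative values only.
home = Condition("home confinement", 0, imposed=90, completed=90)
service = Condition("community service", 0, imposed=100, completed=60)
restitution = Condition("restitution", 2, imposed=1200, completed=900)

for c in (home, service, restitution):
    print(c.name, complied(c), completion_ratio(c))
print(category_summary([home, service, restitution]))
```

The same record serves days imposed vs. completed, hours imposed vs. completed, and amount expected vs. paid, which is why the quantity fields are left unit-neutral here.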


3.2 Intermediate Outcomes 


An intermediate outcome represents the 
expected immediate result of the process 
outcomes. Examples of the data to be col- 
lected for the three intermediate outcomes 
depicted in the post-conviction supervi- 
sion logic model—reduce substance abuse, 
improve employment, and improve other life 
skills—are described in greater detail below. 
■ Reduce substance abuse—Measures of
substance use during the period of supervision
will include drug test results,
self-admissions, and substance-related
re-arrests. These measures will be used
to create a dichotomous measure of substance
use (used or did not use). The outcomes
will be presented by time of event
(before treatment, during treatment and
after treatment) and type of substance.
The analysis will assess “change” by
comparing substance use during supervision
with the offender’s status at the
start of supervision based on such factors
as: prior diagnosis of addiction/abuse
(Y/N); evidence of use at time
of instant offense based on admission,
positive pretrial/presentence drug test, or
offense involving drug/alcohol use
(Y/N); drug(s) of choice; and number of
prior treatment experiences.
■ Improve employment—Measures of
employment may include: change in employment
status (unemployed, unemployed but
seeking employment, part-time employment,
full-time employment), length of
employment (calculated as the percent of
time offender was employed during period
of supervision), and amount of wages.

■ Improve other life skills—There is still
a question as to the type of life skills
that should be identified as outcomes
in the logic model. Specifically, which
life skills meet the intermediate outcome
criteria: “A desirable change in a circumstance
that has an empirically proven
relationship to successful supervision and
that is within an officer’s authority and
sphere of influence”? Potential straightforward
areas are level of education
and new vocational/avocational skills.
Can or should the area be expanded to
encompass such topics as family stability
and community stability without overstepping
appropriate bounds on officer
authority?
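
Two of the intermediate outcome measures described above lend themselves to a short worked example: the dichotomous substance use measure and the percent of the supervision period during which the offender was employed. The function names, argument layout, and dates below are hypothetical and chosen only for illustration; job end dates are treated as exclusive, which is an assumption of this sketch.

```python
from datetime import date


def used_substances(positive_tests, self_admissions, substance_rearrests):
    """Dichotomous measure: True if any indicator of use was recorded."""
    return (positive_tests + self_admissions + substance_rearrests) > 0


def percent_time_employed(supervision_start, supervision_end, job_spells):
    """Percent of supervision days covered by at least one job spell.

    job_spells is a list of (start, end) dates; end dates are exclusive.
    Overlapping spells are counted once.
    """
    total_days = (supervision_end - supervision_start).days
    if total_days <= 0:
        raise ValueError("supervision period must be at least one day")
    employed_days = set()
    for job_start, job_end in job_spells:
        first = max(job_start, supervision_start).toordinal()
        last = min(job_end, supervision_end).toordinal()
        employed_days.update(range(first, last))
    return 100.0 * len(employed_days) / total_days


# Illustrative values only: a one-year term with two overlapping jobs.
pct = percent_time_employed(
    date(2005, 1, 1),
    date(2006, 1, 1),
    [(date(2005, 3, 1), date(2005, 6, 1)),
     (date(2005, 5, 15), date(2005, 8, 1))],
)
print(f"employed {pct:.1f}% of the supervision period")
print("any substance use:", used_substances(0, 1, 0))
```

Counting distinct days rather than summing spell lengths avoids double-counting when an offender holds two jobs at once.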

As with process outcomes, the above 
intermediate outcomes will be tested as 
dependent variables (the result of partici- 
pation in correctional programming) and 
independent variables (predictors of future 
criminal activity and victim restoration). 
As shown in the logic model, the relation- 
ship (as either independent or dependent 
variables) between intermediate outcomes 
also will be tested. Additionally, the
intermediate outcomes will be tested as mediating
variables between process outcomes
(participation in correctional programming)
and ultimate outcomes (minimized
criminal activity).
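
The report does not specify how mediation will be tested. As a hedged illustration, one common formulation for linear models is the product-of-coefficients approach sketched below, where the symbols are assumptions introduced here: X is a process outcome, M an intermediate outcome, Y an ultimate outcome, and Z the vector of input (control) variables.

```latex
% X: process outcome (e.g., participation in correctional programming)
% M: intermediate outcome (e.g., improved employment)
% Y: ultimate outcome (e.g., criminal activity)
% Z: input (control) variables
\begin{align}
M &= \alpha_0 + a\,X + \gamma_1^{\top} Z + \varepsilon_M \\
Y &= \beta_0 + c'\,X + b\,M + \gamma_2^{\top} Z + \varepsilon_Y \\
\text{indirect effect} &= a\,b, \qquad
\text{total effect} = c' + a\,b
\end{align}
```

Under this formulation, the intermediate outcome mediates the relationship to the extent that the indirect effect ab accounts for the total effect, with c' capturing whatever direct relationship remains.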


3.3 Ultimate Outcomes 


An ultimate outcome is the long-term 
result of the system activities for the 
offender. The ultimate outcomes also 
reflect achievement of the mission of the 
federal probation and pretrial services 
system. The five ultimate outcomes that 
best reflect the mission include: minimize 
criminal activity during the period of 
supervision, minimize criminal activity 
beyond the period of supervision, maxi- 
mize victim restoration, defray costs to the 
government, and maximize compliance 
with release conditions. The analysis of 
data on these ultimate outcomes will help 
system staff and stakeholders better assess 
if the missions of protecting the public 
and assisting in the fair administration of 
justice are being achieved. Each ultimate 
outcome is discussed below. 

■ Minimize criminal activity during the
period of supervision—The primary 
measure of criminal activity during 
the period of supervision is whether an 
offender was arrested for a new offense. 
Technical violations are not counted 
as a new offense. The analysis will also 
examine the time to arrest (length of 
time before the arrest for a new offense). 
The results will be presented overall and 
by offense type (e.g., violent, property, 
drug, public order, weapon, immigra- 
tion) and offense level (felony, misde- 
meanor, petty). 

■ Minimize criminal activity beyond the
period of supervision—Measures similar
to those described above will be reported
for criminal activity beyond the period
of supervision.

■ Maximize community restoration—Community
restoration will be measured as
the amount of payments made to victim 
special assessments and, where required, 
the Victims’ Crime Fund. In addition, 
community restoration will be measured 
by the amount of fines paid, etc. 

■ Maximize compliance with release
conditions—Compliance with release 
conditions will include: a dichotomous 
measure of any noncompliance (Y/N), 
number of instances of noncompliance, 
time to first noncompliance (months), 
and time free of noncompliance at 
inactivation/termination of supervi- 
sion (months). 

Ultimate outcome data enable system
staff and stakeholders to test whether the
system activities (processes) are leading to
the long-term outcomes that the federal probation
and pretrial services system is tasked
with achieving. Furthermore, these data will
allow system staff and stakeholders to assess
how well they are doing at meeting their
mission to protect the public and assist in
the fair administration of justice.
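
The compliance measures listed for release conditions (any noncompliance, number of instances, time to first noncompliance, and time free of noncompliance at termination) can be derived from a list of event dates. The sketch below is illustrative: the function name and sample dates are hypothetical, and durations are returned in days so the arithmetic stays exact; converting to months would require a rounding convention the report does not state.

```python
from datetime import date


def compliance_summary(start, end, noncompliance_dates):
    """Summarize noncompliance events that fall within [start, end]."""
    events = sorted(d for d in noncompliance_dates if start <= d <= end)
    if events:
        days_to_first = (events[0] - start).days
        days_free_at_end = (end - events[-1]).days
    else:
        days_to_first = None
        days_free_at_end = (end - start).days
    return {
        "any_noncompliance": bool(events),
        "count": len(events),
        "days_to_first": days_to_first,
        "days_free_at_end": days_free_at_end,
    }


# Illustrative values only: a two-year term with two recorded events.
summary = compliance_summary(
    date(2005, 1, 1),
    date(2007, 1, 1),
    [date(2006, 7, 1), date(2005, 4, 1)],
)
print(summary)
```

The same pattern applies to the time-to-arrest measure: passing arrest dates for new offenses instead of noncompliance events yields the length of time before the first arrest.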


4. Data Analysis Plan 


The data analysis plan describes a recom- 
mended approach to testing the relation- 
ships depicted in the logic model. That is, the 
plan presents the statistical techniques, pro- 
gressing sequentially from simpler to more 
sophisticated levels of analysis, that will 
provide system staff and stakeholders with 
both a description of the offender population 
and outcomes and a more complex assess- 
ment of the hypothesized “theory of change” 
and interconnectedness of process and out- 
comes (e.g., direct, indirect, unidirectional 
and bidirectional relationships) depicted in 
the post-conviction supervision logic model. 
Specifically, the plan describes basic and 
advanced statistical techniques that can be 
applied to test these relationships, while con- 
trolling for inputs that are primarily static 
and outside the control of the probation offi- 
cer. The analysis plan is organized into three 
stages: data quality, data reduction, and data 
analysis. Each is presented below. 


4.1 Data Quality 


Before any data analysis is conducted, all 
data will need to undergo standard checks 
for quality to make sure there are no data 
entry or transmission errors. Checking for 
data quality is typically a two-step process 
that involves detection and then correction 
of errors in a data set. Cleaning and prepar- 
ing data is an often neglected but extremely 
important step in the analysis process. The 
saying “garbage-in-garbage-out” is particu- 
larly applicable where large data sets col- 
lected via some automatic methods (e.g., via 
National PACTS Reporting, National Crime 
Information Center (NCIC), etc.) serve as 
the primary input into the analysis. The
most common sources of error are: data
entry errors, such as typing errors and column
shift (data for one column being entered
under the adjacent column), which often
result in invalid responses; general coding
errors, which may occur during data collection
or entry and may be difficult to detect
without looking for outliers or unusual
relationships between variables; and failure
to recode missing data, which can result in
inflated mean scores and the like. The first
step toward quality data is to establish a clear
understanding of, and procedures for,
collecting and entering data, and to review
data systematically before they are submitted
or made available for extraction. Once
data are submitted, the next step in the data
quality assurance process is to detect errors
and clean the data.
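
A small worked example shows why unrecoded missing-data codes inflate a mean. The values and the choice of 999 and 9999 as missing codes are assumptions made for illustration.

```python
from statistics import mean

# Assumed sentinel codes that stand for "missing" in this example.
MISSING_CODES = {999, 9999}

# Hypothetical ages at start of supervision; 999 marks a missing value.
ages = [23, 31, 999, 45, 28, 999, 37]

raw_mean = mean(ages)  # treats 999 as a real age
cleaned = [a for a in ages if a not in MISSING_CODES]
clean_mean = mean(cleaned)  # missing codes excluded

print(f"mean with codes left in: {raw_mean:.1f}")
print(f"mean after recoding:     {clean_mean:.1f}")
```

Two unrecoded values out of seven are enough to push the average age here from the low thirties to more than three hundred.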


Error Detection 


There are three common procedures for 
detecting data errors that should be fol- 
lowed. These include: 

■ Review descriptive statistics. Using software,
such as SPSS, the following review
of descriptive statistics can identify data
errors:

  ■ Look at minimum and maximum values
  to determine if data fall outside
  the acceptable range.

  ■ Look for the presence of 0’s and 999’s (or
  9999, etc.) in frequency tables,
  graphs or histograms, which may indicate
  missing values.

  ■ Look at means, medians, and standard
  deviations. For example, if the median
  differs much from the mean value, it
  is important to investigate the overall
  distribution of values for outliers.

  ■ Assess frequencies. By examining
  frequencies, it is possible to detect
  unequal distribution in categories
  such as age, sex, and race that are outside
  what would normally be expected
  for a particular population.

■ Conduct logic checks. Errors in data
can be detected simply by determining
whether the responses seem logical. For
example, percentages across response
categories should total 100%, not 110%.

■ Examine bivariate outliers. Some data
errors only appear when two variables
are compared. To detect such errors, it is
important to look for outliers, or values
of a variable that are far different from
the expected values. These errors can be
detected by examining bivariate associations
and scatterplot graphs to check
for deviations in expected associations
between variables.
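
The first two detection procedures above can be sketched in a few lines of standard-library code. The function names, the acceptable range of 1 to 120 months, and the tolerance on percentage totals are all assumptions chosen for this example; the report names SPSS as the working tool, and this sketch only mirrors the logic. The bivariate check is not shown.

```python
from statistics import mean, median, pstdev


def out_of_range(values, low, high):
    """Return the values that fall outside the acceptable range."""
    return [v for v in values if not (low <= v <= high)]


def mean_median_gap(values):
    """Gap between mean and median, in standard-deviation units."""
    sd = pstdev(values)
    return 0.0 if sd == 0 else abs(mean(values) - median(values)) / sd


def percentages_consistent(shares, tolerance=0.5):
    """Logic check: category percentages should total about 100."""
    return abs(sum(shares) - 100.0) <= tolerance


# Hypothetical supervision terms in months; 360 looks like an entry error.
terms = [12, 36, 24, 36, 60, 360, 24]

print("out of range:", out_of_range(terms, 1, 120))
print("median:", median(terms), "mean:", round(mean(terms), 1))
print("mean-median gap (SD units):", round(mean_median_gap(terms), 2))
print("percentages total 100:", percentages_consistent([40, 45, 25]))
```

A single mis-keyed value is enough here to pull the mean far above the median, which is the signal the descriptive review is meant to surface.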

Once the data errors are detected, there 
are several techniques that should be fol- 
lowed for correcting the errors. These are 
described below. 


Error Correction 


Once errors are detected, it is important to 
know how to handle them appropriately so 
the data can be analyzed without losing their 
integrity or robustness. There are slightly 
different ways to deal with errors in
independent (or predictor and control) variables
and dependent (outcome) variables.


Independent Variables 


When there are a minimal number of errors, 
the values are generally recoded to “miss- 
ing.” What this means is that the suspicious 
values are counted as missing data since they 
are not within an acceptable range. If there 
are many error values, then it is important 
to check to see if some of the values of the 
outcomes are the same for missing and non- 
missing values for the independent variables. 
If so, then there is less chance of bias in 
the analysis. If not, then it is possible that 
the data is not good and that the variable 
should be discarded or used with caution. 
Various imputation-based procedures to fill 
in missing values (series mean, mean or 
median of nearby points, linear interpola- 
tion, linear trend at point) will need to be 
considered. Other more complex imputa- 
tion-based procedures (regression imputa- 
tion, non-ignorable missing-data models, 
Heckman’s two-step statistical process) may 
also need to be used.5
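
The recode-then-impute sequence can be sketched in a few lines of Python. The series below (monthly treatment hours for one offender) and the acceptable range of 0 to 100 are hypothetical assumptions; only two of the simpler procedures named above, series mean and linear interpolation, are shown.

```python
from statistics import mean

# Hypothetical monthly treatment hours; -5 and 400 are entry errors.
hours = [10, 12, -5, 16, 400, 20]

def recode_to_missing(values, low, high):
    """Replace out-of-range values with None (treated as missing)."""
    return [v if low <= v <= high else None for v in values]

def impute_series_mean(values):
    """Fill missing entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def impute_linear(values):
    """Fill interior missing entries by linear interpolation.

    Assumes the first and last entries are observed and that the
    order of the series is meaningful (e.g., consecutive months).
    """
    filled = list(values)
    for i, v in enumerate(filled):
        if v is None:
            left = i - 1  # observed, or already filled on an earlier pass
            right = next(j for j in range(i + 1, len(values)) if values[j] is not None)
            step = (values[right] - filled[left]) / (right - left)
            filled[i] = filled[left] + step
    return filled

cleaned = recode_to_missing(hours, low=0, high=100)
print(cleaned)
print(impute_series_mean(cleaned))
print(impute_linear(cleaned))
```

The two procedures give different fills for the same gaps, which is one reason the choice of imputation method should be reported along with the results.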


Dependent Variables


If there are few data errors, values can be set
to “missing” and then filled in using whichever
imputation-based procedure is determined to be
most appropriate. However, it is important to use
caution when setting many values to “miss- 
ing,” especially if multiple variable analysis 
will be conducted. It may be necessary to 
set the error values for the outcome or inde- 
pendent variable to the data set mean or the 
group mean (maybe by age, type of offender, 
etc.). This should result in a histogram with 
a more normal distribution of values. Once 
quality of the data has been checked and the 
necessary steps taken to correct for prob- 
lems or errors, data analysis can proceed 
from basic to more advanced techniques 
described below. 


5 While most of the imputation-based procedures
described are available through SPSS, other
more advanced procedures may require additional
statistical software, such as SAS or Stata.


4.2 Data Reduction 


Data reduction is a process often applied 
where the goal is to aggregate or amalgam- 
ate information contained in large data 
sets into more manageable and reliable 
information. Data reduction techniques 
can include simple tabulations, aggrega- 
tions, or more sophisticated techniques, 
such as clustering, principal components 
analysis, and path analysis. Each of these 
recommended data reduction methods is 
described below. 


Aggregation 

Aggregation and transformation of data are
techniques often used to reduce or optimize
your data. Data can be aggregated,
for example, by subgroups to move from
individual case records for thousands of
individual offenders to mean scores for subgroups
of offender populations based on
certain criteria (e.g., criminal history, district,
gender, etc.). Additionally, data can
be transformed by creating dichotomous 
variables (presence/absence of new offense) 
from continuous variables (number of new 
offenses) or composite scores or constructs 
from multiple measures (risk assessment 
score). When transforming data, especially 
creating composite measures or constructs, 
it is important to use other techniques, 
such as those described below, to determine 
which variables should be combined to cre- 
ate a new variable. 
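
A minimal Python sketch of both steps follows: a count of new offenses is turned into a presence/absence flag, and case records are then rolled up to one summary row per district. The records and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical case-level records (one row per offender).
cases = [
    {"district": "A", "new_offenses": 0, "risk_score": 12},
    {"district": "A", "new_offenses": 2, "risk_score": 20},
    {"district": "B", "new_offenses": 1, "risk_score": 15},
    {"district": "B", "new_offenses": 0, "risk_score": 9},
    {"district": "B", "new_offenses": 0, "risk_score": 6},
]

# Transformation: continuous count -> dichotomous presence/absence flag.
for case in cases:
    case["any_new_offense"] = 1 if case["new_offenses"] > 0 else 0

# Aggregation: case records -> one summary row per district.
by_district = defaultdict(list)
for case in cases:
    by_district[case["district"]].append(case)

summary = {
    district: {
        "n": len(rows),
        "mean_risk_score": mean(r["risk_score"] for r in rows),
        "pct_any_new_offense": 100 * mean(r["any_new_offense"] for r in rows),
    }
    for district, rows in by_district.items()
}
print(summary)
```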


Cluster Analysis 


Cluster analysis is a multivariate analysis 
technique that seeks to organize informa- 
tion about variables so that relatively homo- 
geneous or similar groups, or “clusters,” 
can be formed. To use this technique, it 
is important that the clusters formed be 
highly internally homogeneous (members are
similar to one another) and highly exter- 
nally heterogeneous (members are not like 
members of other clusters). Cluster analysis 
can be used to combine “similarity” mea- 
sures as well as measures that are proxies or 
associations. However, it is first necessary to 
standardize your data since clustering often 
involves combining items measured on dif- 
ferent scales. 
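
The sketch below shows why standardization comes first and what a simple clustering pass looks like. It uses k-means, which is only one of several clustering methods and is an assumption here, since no specific method is named above. The data, the two measures, and the choice of starting centroids are likewise hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical measures on different scales: (age in years, risk score 0-50).
points = [(22, 40), (25, 38), (24, 42), (51, 10), (48, 12), (55, 8)]

def standardize(rows):
    """Convert each column to z-scores so no single scale dominates distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, centroids, iterations=10):
    """Plain k-means; `centroids` are starting points chosen by the analyst."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each case joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: sq_dist(r, centroids[k]))
                  for r in rows]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centroids[k] = tuple(mean(c) for c in zip(*members))
    return labels

z = standardize(points)
labels = k_means(z, centroids=[z[0], z[3]])
print(labels)
```

Cases that share a label form one cluster; with real data the number of clusters and the starting points would need to be justified and the solution checked for stability.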


Principal Component Analysis 

Principal component analysis is used to 
reduce the number of variables or factors for 
inclusion in your analysis. Specifically, this 
method of analysis is used to combine two or 
more correlated variables into a single factor.
Principal component analysis helps reduce 
redundancy in your measures by identifying 
and combining those that are highly corre- 
lated into a new variable, while minimizing 
the variance around the new variable.6
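
For the simplest case of two correlated measures, the first principal component can be worked out by hand, as in the Python sketch below. The two variables and their values are hypothetical, and the closed-form result used here holds only for two standardized variables with a positive correlation; larger problems require statistical software.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical correlated measures: prior arrests and prior convictions.
arrests = [1, 3, 4, 6, 8, 9]
convictions = [0, 2, 2, 4, 5, 7]

def z_scores(values):
    """Standardize a list of values (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Pearson correlation as the mean product of z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return mean(a * b for a, b in zip(zx, zy))

r = pearson_r(arrests, convictions)

# For two standardized variables the correlation matrix is [[1, r], [r, 1]].
# Its eigenvalues are 1 + r and 1 - r; when r > 0 the first eigenvector is
# (1, 1) / sqrt(2), so the first component is the scaled sum of the z-scores.
eigenvalues = (1 + r, 1 - r)
explained = eigenvalues[0] / sum(eigenvalues)
component = [(a + b) / sqrt(2)
             for a, b in zip(z_scores(arrests), z_scores(convictions))]

print(round(r, 3), round(explained, 3))
print([round(c, 2) for c in component])
```

The `explained` value is the share of the total variance in the two measures that the single new variable carries; the closer it is to 1, the less information is lost by combining them.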


Path Analysis 


Once you have arrived at a set of mea- 
sures that represent the variables of interest 
(inputs, processes, process outcomes, inter- 
mediate outcomes, and ultimate outcomes), 
it is necessary to determine whether the 
relationships or paths among those variables 
presented in the post-conviction supervi- 
sion logic model can be supported by the 
data. That is, do the data fit the model? Path 
analysis calculates a path coefficient, which 
shows the direct effect of an independent 
variable on a dependent variable in the 
path model. This information is then used 
to calculate a goodness-of-fit statistic. The 
statistic determines how well the data fit the 
model, or stated another way, the statistics 
help identify the best fitting models for the 
data.7 For example, based on the analysis,
it may be determined that certain predicted 
or hypothesized relationships shown in the 
logic model are not supported by the data. 
Or, the analysis may uncover other relation- 
ships (direct or indirect) that are not current- 
ly represented in the model. It is important 
to test these relationships using path analysis 
or other more sophisticated techniques, such 
as structural equation modeling (a variation 
of path analysis involving multiple indica- 
tors of variables in the model), to ensure the 
“big picture” is accurate before attempting 
to further describe your data and examining 
more minute relationships among variables 
(see basic and advanced analysis sections 
below).8 Based on the results, it may be
necessary to modify the model (add and/or
remove arrows depicting relationships, add
and/or remove variables, etc.).


6 For principal component analysis, variance
maximizing (varimax) should be selected as the
rotation method. Additionally, different
criteria (Kaiser criterion, scree test) should
be examined to determine which solution makes
the best sense, often one retaining more factors
(Kaiser) than the other (scree).


7 There are over 25 goodness-of-fit calculations
available through the SPSS add-on AMOS. The
most commonly used are model chi-square (not
significant indicates model fit), GFI (goodness-of-fit
index) (greater than or equal to .90 to accept the
model), and CFI (comparative fit index) (greater
than or equal to .90 to accept the model). Any or
all of these should be compared.


8 Path analysis and structural equation modeling
can be conducted using SPSS and the SPSS
add-on software AMOS as well as SAS.


4.3 Data Analysis 


The data analysis stage is presented in two 
parts: basic analysis and advanced analysis. 
Each is presented below. To help illustrate
the rationale behind each recommended
analytic technique, examples drawn from
the substance abuse “thread” of the logic
model shown below are discussed.

[Figure: substance abuse “thread” of the logic model. Labels: INPUTS → PROCESS →; Participate in correctional programming; Reduce substance abuse; Minimized criminal activity during/beyond supervision]


Basic Analysis


Once the steps necessary to reduce or
optimize the available data have been completed,
the first or basic level of analysis
to be conducted of the input, process, and
outcome data will use descriptive statistics.
Descriptive statistics should be used to:

■ Describe the basic features of the data

■ Provide simple quantitative summaries
about the measures

■ Provide the basis for subsequent data
analysis.

In general, descriptive statistics will
describe “what is,” or what the data show
with respect to a given variable.


Univariate Analysis


Univariate analysis of the data is used to
examine across cases or groupings (e.g.,
offender cohorts, offender type, districts,
etc.) one variable (or outcome) at a time. The
major characteristics or descriptive statistics
to be examined for each variable include:

■ Distribution (frequency distribution,
percentages)

■ Central tendency (mean, median, mode)

■ Dispersion (e.g., range, variance, standard
deviation).

These descriptive statistics will provide
system staff and stakeholders with basic
information about the offender population
as a whole.9


9 Most univariate analysis can be conducted
using the Descriptive Statistics option within the
Analyze function of SPSS.


Testing the “thread” of the logic model
shown above, the following sample questions
can be answered using univariate analysis:

■ What are the demographics (e.g., distribution
by race, age, gender, risk level, seriousness
of previous criminal offense(s)) of
the offender population entering supervision
with an identified substance abuse
problem?

■ What percentage of offenders with an
identified substance abuse problem participate
in substance abuse treatment? What
is the average dosage of substance abuse
treatment received across this offender
population? What is the most common
modality of treatment provided?

■ What percentage of this offender population
abstains from using illegal or other
restricted substances during the period of
supervision?

■ For those offenders with a history of substance
abuse who are identified as using
substances during supervision, what
are the most common drugs of choice
identified?

■ What percentage of this offender population
commits a new offense during
supervision? Following release from
supervision?

■ What is the average length of time to a
new offense for this population?

Additionally, by plotting the data using
scatterplots, histograms, bar graphs, or
other options, it will be possible to identify 
potential differences or variation in outcome 
responses for subgroups of the offender pop- 
ulation for further analysis. For example, 
are there visible differences in the distribu- 
tion of offenders with a history of substance 
abuse with a new arrest while under supervi- 
sion? Are there differences in the distribu- 
tion of this offender population by reasons 
for ending supervision (e.g., successfully 
completed on time, early release, revocation, 
absconded)? Is there wide variation in the 
amount of treatment hours provided for this 
population of offenders? 
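
The three groups of descriptive statistics named in this subsection (distribution, central tendency, dispersion) can be produced with Python's standard library, as in the sketch below. The two variables and their values are hypothetical, and the sample (n - 1) versions of variance and standard deviation are used as an assumption about the reporting convention.

```python
from collections import Counter
from statistics import mean, median, mode, stdev, variance

# Hypothetical values for one variable: treatment hours per offender.
treatment_hours = [0, 10, 10, 12, 20, 20, 20, 30, 45, 60]
# Hypothetical categorical variable: reason supervision ended.
end_reasons = ["completed", "completed", "revoked", "early release",
               "completed", "absconded", "revoked", "completed"]

# Distribution: counts and percentages per category.
counts = Counter(end_reasons)
percentages = {k: 100 * v / len(end_reasons) for k, v in counts.items()}

# Central tendency.
central = {
    "mean": mean(treatment_hours),
    "median": median(treatment_hours),
    "mode": mode(treatment_hours),
}

# Dispersion (sample variance and standard deviation).
dispersion = {
    "range": max(treatment_hours) - min(treatment_hours),
    "variance": variance(treatment_hours),
    "std_dev": stdev(treatment_hours),
}

print(percentages)
print(central)
print(dispersion)
```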


Bivariate Analysis 


The next step in the analysis is to begin test- 
ing the significance of the covariates (inputs, 
process (activities), process outcomes) on the 
intermediate (e.g., substance use, employ- 
ment, life skills) and ultimate outcomes 
(criminal activity, victim restoration, compliance
with release conditions) using bivariate
analysis techniques, such as correlations,
cross tabulations with chi-square statis- 
tics, t-tests, and analysis of variance. These 
techniques are recommended because they 
are relatively easy to conduct and provide 
straightforward interpretations.10 The bivariate
analysis will provide preliminary indication
of the relationship between independent
and dependent variables to be further tested
with the inclusion of control variables (e.g.,
offender characteristics, offense characteristics,
sentence characteristics, etc.) with more
advanced analysis.

For example, analysis of variance 
(ANOVA) is used to test differences between 
two or more means. ANOVA helps to deter- 
mine which variables have a significant 
influence on an outcome, and/or how much 
of the variability in the outcome or depen- 
dent variable is attributable to each factor. 
The two types of ANOVAs that can be used 
to test such differences include: 

■ One-Way ANOVA. A one-way analysis
of variance is used when the data are
divided into groups according to only one
factor (e.g., level of risk, type of correctional
programming, compliance status). The
questions of interest are usually: Is there a
significant difference between the groups?
and if so, which groups are significantly different
from which others? Statistical tests are
provided to compare group means, group
medians, and group standard deviations.11

■ Multifactor (factorial) ANOVA. When
more than one independent or control
variable is present and the factors are
crossed (e.g., instant offense by prior
record), a multifactor ANOVA is appropriate.
Both main effects and interactions
between the factors may be estimated.


10 Bivariate analysis can be conducted by using
the Analyze function of SPSS and selecting
Descriptive Statistics, Compare Means, and Correlate
options.


11 When comparing means using ANOVA, multiple
range tests are used, the most popular of
which is Tukey’s HSD procedure.


Using the substance abuse example, a
one-way ANOVA might be conducted to test
differences in mean number of substance
abuse relapses between those offenders with
a history of substance abuse who participate
in individual substance abuse treatment and
those who participate in other modalities
of treatment (group, family, and other).12 A
detailed example and interpretation of the
results from an ANOVA are presented in
Attachment 7.
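
The arithmetic behind a one-way ANOVA can be sketched in Python as follows. The relapse counts for the three treatment modalities are hypothetical, and the sketch stops at the F statistic and its degrees of freedom; obtaining a significance level requires the F distribution, which statistical software supplies.

```python
from statistics import mean

# Hypothetical relapse counts by treatment modality.
groups = {
    "individual": [0, 1, 1, 2, 0],
    "group":      [2, 3, 1, 2, 4],
    "family":     [1, 2, 2, 3, 1],
}

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a dict of group samples."""
    all_values = [v for vals in samples.values() for v in vals]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(vals) * (mean(vals) - grand_mean) ** 2
                     for vals in samples.values())
    # Variation of individual cases around their own group mean.
    ss_within = sum((v - mean(vals)) ** 2
                    for vals in samples.values() for v in vals)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova(groups)
print(round(f_stat, 2), df_b, df_w)
```

A larger F means the differences between group means are large relative to the variation within groups.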

Another useful bivariate analysis to be 
considered is the negative binomial regres- 
sion model. This technique can be used to 
test whether different subgroups of offenders 
commit more frequent acts of noncompli- 
ance with conditions of supervision, relapse, 
and/or new offenses or other outcomes than 
other subgroups. Negative binomial regres- 
sion models were developed specifically for 
the kind of distribution of failures that are 
likely to be observed with these offender 
data (i.e., a large portion of the offenders 
will not fail at all during the time observed, 
some will fail once, fewer will fail twice, and 
a handful will fail more often). This type of 
skewed distribution (if present) would violate 
the normality assumptions of ANOVA.13
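
The kind of skewed failure distribution described above can be examined with a short Python sketch. This is not a regression: it only checks a hypothetical set of failure counts for overdispersion (variance larger than the mean) and fits a negative binomial distribution by the method of moments, which is an assumed shortcut chosen for illustration. Fitting the regression model itself requires statistical software.

```python
from math import exp, lgamma, log
from statistics import mean, variance

# Hypothetical counts of noncompliance events per offender: mostly zeros,
# with a long right tail.
failures = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 5]

m = mean(failures)
v = variance(failures)   # sample variance
overdispersed = v > m    # a Poisson model would expect v close to m

# Method-of-moments estimate of the negative binomial size parameter r,
# using variance = m + m**2 / r (only valid when v > m).
r = m ** 2 / (v - m)
p = r / (r + m)

def nb_pmf(k):
    """Negative binomial probability of exactly k failures (mean m, size r)."""
    return exp(lgamma(k + r) - lgamma(r) - lgamma(k + 1)
               + r * log(p) + k * log(1 - p))

print(round(m, 2), round(v, 2), overdispersed)
print([round(nb_pmf(k), 3) for k in range(6)])
```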


Advanced Analysis 


Based on the results of the univariate and 
bivariate analysis, more advanced statisti- 
cal techniques (logistic regression, survival 
or Cox regression) are recommended to 
test the relationship between process out- 
comes, intermediate outcomes, and ultimate 
outcomes after controlling for inputs (e.g., 
characteristics, sentence, and resources) 
and process (system activities). The recom- 
mended advanced statistical techniques are 
described below. 


Logistic Regression 


Logistic regression allows one to predict a 
discrete outcome, such as group membership, 
from a set of variables that may be measured 
at any level (interval, ratio, ordinal, or nomi- 
nal) or a mix of levels. Generally, the depen- 
dent or outcome variable is dichotomous, 
such as presence/absence or compliance/non- 
compliance. Logistic regression calculates the 
probability of success over the probability of 
failure, presenting the results in the form of 
an odds ratio. For example, logistic regression 
can tell us the probability that an offender 
will reoffend after controlling for various 
input or process factors. Logistic regression 
also provides knowledge of the relationships 
and strengths among the variables. 


12 ANOVA can be conducted by using the Analyze
function in SPSS and selecting the Compare
Means/One-Way ANOVA option.


13 This particular statistical technique is not
available through SPSS. It would require the use
of Stata statistical software.


For example, it is possible to use logistic 
regression to test the relationship depicted 
in the logic model between reduced sub- 
stance abuse and minimized criminal activ- 
ity beyond the period of supervision. The 
independent variable would represent a mea- 
sure of substance abuse during supervision. 
This could include a dichotomous variable 
representing presence or absence of relapse 
created from existing measures of substance 
abuse or a continuous variable represent- 
ing the number of relapses during supervi- 
sion. The dependent variable would represent 
criminal activity following supervision. For 
logistic regression, this would be represented 
by a dichotomous variable, such as pres- 
ence or absence of a new offense, substance 
abuse-related or other type of crime, or other 
dichotomous measures of ultimate outcomes 
presented in the logic model. Additionally, it 
is important to include measures of inputs 
as control variables into the regression equa- 
tion, such as characteristics of the offender, 
offense, and sentence. For ease of interpre- 
tation and to best understand the variance 
accounted for by the control variables, it is 
recommended that the inputs or controls be 
entered first in the equation as a block or set 
of control variables. 

Compared with linear regression, the interpretation
of the exponentiated coefficient for logistic
regression (Exp(B)) is more straightforward: it is
an odds ratio, the factor by which the odds of an
event occurring change for a one-unit change in
the predictor. Using the substance abuse example, the
results of logistic regression can tell you how
much higher the odds of committing a new
(non-substance abuse-related) offense during
and beyond supervision are for an offender who
relapses during treatment than for an offender
who does not relapse during treatment. With
logistic regression, it is also possible to con- 
trol for a block or set of variables (inputs, 
process, etc.) in the analysis. 
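
The meaning of Exp(B) can be seen in the simplest possible case, sketched below in Python. With a single dichotomous predictor and no control variables, the logistic regression estimates reduce to closed forms that can be computed from a 2x2 table. The counts are hypothetical, and a real analysis with control variables entered as a block would be fitted in statistical software rather than by hand.

```python
from math import exp, log

# Hypothetical 2x2 table: relapse during supervision vs. new offense afterward.
relapsed     = {"offense": 30, "no_offense": 20}
not_relapsed = {"offense": 15, "no_offense": 60}

def odds(cell):
    """Odds = P(event) / P(no event) = events / non-events."""
    return cell["offense"] / cell["no_offense"]

# With a single dichotomous predictor and no controls, the logistic
# regression estimates reduce to these closed forms.
intercept = log(odds(not_relapsed))          # log-odds when relapse = 0
b_relapse = log(odds(relapsed)) - intercept  # change in log-odds
exp_b = exp(b_relapse)                       # Exp(B), the odds ratio

def predicted_probability(relapse):
    """Predicted probability of a new offense for relapse = 0 or 1."""
    z = intercept + b_relapse * relapse
    return 1 / (1 + exp(-z))

print(round(exp_b, 2))
print(round(predicted_probability(1), 2), round(predicted_probability(0), 2))
```

Note that the odds ratio and the ratio of the two predicted probabilities are different quantities, which is why Exp(B) should be reported as a change in odds rather than as "times more likely."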


Survival Modeling 


Survival modeling is recommended when 
you want to examine the relationship among 
independent variables or covariates and the 
time to events of interest, for example, time 
to employment, time to relapse, time to 
recidivism, time to completion of paying res- 
titution, etc. Static models alone are insuf- 
ficient in this situation because they assume 
that a rearrest or other outcome is the same 
regardless of whether it occurred on the first 
or last day of the period of interest. The timing
of these events, however, is a particularly
important distinction when considering
the public policy and safety implications of
supervision. Survival modeling or analy- 
sis is an effective statistical technique to 
use when you want to examine the impact 
of time-varying covariates on these events. 
It is a particularly useful technique when 
comparing groups with varying follow-up 
periods. Survival analysis handles time at 
risk by subdividing the follow-up period 
into smaller observation points. At each of 
these points, the proportion of the sample 
that is at-risk for reoffending, for example, 
is used to estimate the probability of surviving
beyond that point. This method ensures
that only the characteristics of the popula- 
tion still at risk are used to estimate the 
time until failure, thereby providing a more 
accurate prediction of failure. Additionally, 
survival functions use maximum likelihood 
techniques that can differentiate between 
censored and uncensored cases. That is, sur- 
vival modeling accounts for those cases who 
survive throughout the follow-up period.14
An example of the results of a hypotheti- 
cal comparison (based on “dummy” data) of 
the risk of recidivism among offenders with 
a history of substance abuse who partici- 
pated in individual treatment compared to 
offenders who participated in group treat- 
ment, after controlling for race, age, and risk 
assessment is shown in Attachment 7. Like 
logistic regression, the results of survival
modeling are interpreted as the likelihood of 
an event occurring. 
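
How survival analysis uses only the cases still at risk, and how it treats censored cases, can be shown with the Kaplan-Meier product-limit estimator. This is a simpler technique than the Cox regression recommended above and is substituted here only for illustration; it does not adjust for covariates. The follow-up data in the Python sketch are hypothetical.

```python
# Hypothetical follow-up data: (months observed, event flag).
# event = 1 means rearrest occurred at that month; event = 0 means the case
# was censored (still arrest-free when observation ended).
cases = [(2, 1), (3, 0), (5, 1), (5, 1), (8, 0), (10, 1), (12, 0), (12, 0)]

def kaplan_meier(data):
    """Return [(time, survival probability)] at each observed event time."""
    survival = 1.0
    curve = []
    for t in sorted({time for time, event in data if event == 1}):
        # Only cases still under observation at time t count as at risk.
        at_risk = sum(1 for time, _ in data if time >= t)
        events = sum(1 for time, event in data if time == t and event == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

curve = kaplan_meier(cases)
for month, s in curve:
    print(month, round(s, 3))
```

Censored cases contribute to the at-risk count up to the month they leave observation and are then dropped, rather than being counted as either successes or failures.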


Linear and Multiple Regression 


While logistic regression and survival mod- 
eling are valuable statistical techniques to 
use when the outcome or dependent vari- 
able of interest is a dichotomous variable, 
other regressions should be used when the 
outcome of interest is a continuous vari- 
able. For example, if you want to deter- 
mine whether there is a relationship between 
offenders who demonstrate a reduction in 
substance abuse during supervision and the 
amount of restitution an offender is able 
to pay, linear regression should be used. 
If you want to examine the relationship of 
more than one variable, for example sub- 
stance abuse, employment, and life skills, 
on an outcome (amount of restitution paid
or number of new offenses), you should use 
multiple regression. These are all statistical 
techniques that will enable system staff and 
stakeholders to further explore threads of 
the post-conviction supervision logic model 
and identify predictors of success. 
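
A minimal Python sketch of linear regression with one predictor follows. The data (reduction in positive drug tests and dollars of restitution paid) are hypothetical. Only the single-predictor case is shown; multiple regression with several predictors requires solving a system of equations and is normally left to statistical software.

```python
from statistics import mean

# Hypothetical data: reduction in positive drug tests vs. restitution paid ($).
reduction   = [0, 1, 2, 3, 4, 5]
restitution = [100, 180, 310, 390, 520, 600]

def simple_ols(x, y):
    """Least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

slope, intercept = simple_ols(reduction, restitution)
print(round(slope, 2), round(intercept, 2))
```

The slope is the expected change in restitution paid for each one-unit change in the predictor, and the intercept is the expected amount when the predictor is zero.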


Trend Analysis 


Trend analysis can be used to examine
changes in outcomes over time for a given
population, as well as to compare trends in
outcomes across subgroups. Trend analysis
is often depicted by a graph, like the
one shown below. This graph depicts hypothetical
trends in the behavior of offenders
entering post-conviction supervision with
an identified substance abuse problem, specifically
compliance with restrictive and
financial conditions (shown as the percentage
in compliance) over a 12-month period.
The graph shows that a greater percentage
of offenders were in compliance with
restrictive compared to financial conditions
during the 12-month period. However, both
show a gradual increase in the percentage
of offenders in compliance from January
to December. To predict future values for
a variable, for example compliance with
restrictive conditions beyond December,
time series analysis is required.15

[Figure: Sample Trend Lines]


14 Survival analysis can be performed using the
Analyze function and selecting the Survival/Cox
Regression option in SPSS.


Analysis Assumptions 


It is important to recognize that there are 
two key assumptions underlying the above 
analysis plan. The first assumption is that all 
independent and control variables are treated
as exogenous variables, that is, variables
whose variability is assumed to be determined
by causes outside the model under
consideration. No attempt is made to explain
the variability of an exogenous variable or 
its relations with other exogenous variables. 
Stated another way, none of the independent 
or control variables are said to affect (or to 
be affected by) any of the other variables. It 
is, however, recognized that these variables 
may be correlated with one another. The 
second assumption is that the logic model 
(multistage model) or segments of the model 
(single-stage model) being tested are well 
specified. Under this assumption, for regres- 
sion analysis, it is possible to interpret the 
regression coefficient as the expected change 
in the dependent variable associated with a 
unit change in the variable in question, while 
partialing out the influence of the other 
variables (independent and/or control).16

As the logic model and hypothesized 
relationships are further developed and test- 
ed, and data collection refined, additional 
analysis (e.g., differences of proportions, 
interrupted time series) will be considered 
for future assessments. 


15 While you can use SPSS to generate trend
graphs, a software add-on, SPSS Trends, is 
required to conduct more sophisticated time 
series analysis. 


16 When comparing across different populations,
it is important to use b’s (unstandardized regression
coefficients) rather than β’s (standardized regression
coefficients) because the latter are more sensitive to
fluctuations in variances and covariances across
populations.
5. Reporting


The results-based management framework 
will generate an informative annual research 
report that effectively shows and explains 
changes in outcomes over time and reasons 
or predictors of those changes. While the 
specific layout of the information in the 
report will be driven by the type of analysis
and research questions to be addressed, a
general template for the report is presented
in Attachment 8 [omitted here]. Each section
of the report is described in detail below.

I. Overview. This section will contain 
a standard description of the results- 
based management framework, 
including the overall purpose and a 
description of the logic model for post- 
conviction supervision underlying the 
framework. 

II. Description of methodology. This sec- 
tion will contain a detailed descrip- 
tion of the methodology behind the 
framework. In particular, it will include 
the research questions to be addressed, 
data sources and measures examined, 
selection criteria and resulting sample, 
and limitations to sample (e.g., limita- 
tions to generalizability of results to 
entire offender population). 

III. Data quality and reduction process 
and results. This section will include a 
description of the data quality process 
(see Subsection 4.1) and the results 
of the analysis, including percent- 
ages for missing data, data errors, 
and other exclusionary factors that 
result in a reduction of the data set; 
description of data reduction meth- 
ods (see Subsection 4.2) and results; 
and a discussion of any implications 
or limitations to the analysis as a 
result of the data quality assessment 
and data reduction process. 

IV. Model fit. Because of the importance 
of the logic model to the integrity of 
the results, it is important to present 
the model fit results (test of the model 
as a whole) separate from the results 
for specific relationships depicted in 
the model. This section will include a 
description of the analysis conducted 
(e.g., principal component, path anal- 
ysis, structural equation modeling), 
results (goodness of fit statistics), and 
an interpretation of the results. The 
interpretation is important for guiding 
the subsequent analysis and results. 
For the initial annual assessment, a
complete test of the model may not be
possible due to unavailable data.

V. Post-conviction supervision results.
This section is critical to understanding 
what aspects of post-conviction super- 
vision are producing desired outcomes 
and which areas need modifications 
or improvements. This section will be 
divided into two parts: demograph- 
ics and outcomes. The demograph- 
ics subsection should begin with the 
results of the univariate or descriptive 
analysis in order to provide the reader 
with a profile or profiles of the offend- 
er population, the system itself (dis- 
tricts, offices, regions, etc.), and other 
contextual factors that are important 
for interpreting the results. Addition- 
ally, it is important to point out that 
these measures will be used as control 
variables in the advanced analysis of 
outcome measures. The results of the 
bivariate analysis should be presented 
next to identify and provide evidence 
for testing specific relationships among 
processes and outcomes. 

The next subsection will present the 
outcomes for post-conviction supervi- 
sion. The logistic regression should be 
presented first with a description of the 
analytic technique (see Subsection 4.3), 
followed by an interpretation of the 
results for each run. This should include 
a description of the control variables, 
the relationship(s) being tested, and the 
findings. Where appropriate, graphics 
should be used to present the findings. 
Next, the results of the linear and mul- 
tiple regressions should be presented. 
Together, the findings from the logis- 
tic and linear/multiple regressions will 
provide important information regard- 
ing the specific relationships depicted 
in the logic model (e.g., does X lead to Y 
when controlling for A and B?). 

The regression results will be fol- 
lowed by a presentation of the sur- 
vival modeling. A description of the 
analytic technique (see Subsection 4.3) 
will be necessary to ensure the reader 
understands why survival modeling is 
being used. In particular, it is impor- 
tant to explain that survival modeling 
is needed to compare outcomes for 
the different entering offender cohorts 
compared annually. Additionally, the 
results of the survival modeling can 
help predict likelihood of success (or
failure) for an offender in the absence of
complete follow-on data for all offend- 
ers. This will be important for the first 
several years of the assessments. Again, 
the use of graphs and charts to present 
findings is recommended. The results 
and an interpretation of each run will 
be provided. 

Finally, the results of the trend 
analysis will be presented. This infor- 
mation will provide the reader with 
a description of how outcomes have 
changed over time and for which 
offender populations. Additionally, the 
effectiveness of different treatments or 
interventions can be compared over 
time. Using line graphs is the most 
effective method for presenting trend 
data. An interpretation of the findings 
will be presented. 

VI. Implications and Recommendations 
for Policy and Practice. This is the 
most important section of the report. 
It will begin with a summary of key 
findings and a discussion of any unex- 
pected findings and limitations of the 
data. Next, implications of the find- 
ings for setting priorities and making 
policy, programming, and resourcing 
decisions need to be presented followed 
by specific recommendations supported 
by the results. If possible, suggestions 
for implementing the recommenda- 
tions should also be included. 

Subsequent reports will follow the same 
template but will need to address, if appro- 
priate, the following: 

m Changes to data collection (process, 
sources, measures) and explanation for 
changes 

m Changes to logic model and explanation 
for changes 

m Changes to research questions (or focus 
of analysis) and reasons for changes. 

It is important to include the core sec- 
tions in each annual report but recognize 
the content may change and the format 
may need to be flexible. For example, once 
pretrial information is incorporated into 
the framework, it may be necessary to cre- 
ate a separate report template to present 
these findings or make modifications to the 
existing template in order to combine the 
results into a single report. Feedback should 
be obtained from end-users and changes 
made to the report template as appropriate to 
ensure clear communication of results and 
the usability of information. 
7. Institutionalizing the Framework


Once the framework has been implemented, 
it is important to ensure it maintains momentum
and continues to receive attention from
management. Success can be measured by
the extent to which the results-based man- 
agement framework becomes institutional- 
ized within OPPS. Just as the 2004 Strategic 
Assessment of the Federal Probation and 
Pretrial Services System recommended the 
need to organize, staff, and resource to pro- 
mote mission-critical outcomes, the same 
can be said for the sustainability of the 
framework needed to assess progress toward 
those outcomes. Accomplishing this task 
will require adherence to a predefined yet 
flexible process, strong management sup- 
port, and a commitment of resources. Each 
of these critical factors is described in the 
subsections below. 


7.1 Predefined Process 


To ensure full implementation of the results- 
based management framework, a continuous 
process of assessment, review, and modifica- 
tion is necessary. The specific steps in the 
process are outlined below. 


Ongoing Consistent Assessments 


Ongoing assessment is defined as annual 
extraction, analysis, and reporting on pro- 
cesses and outcomes for offender cohorts 
entering post-conviction supervision during 
specific fiscal years. Over time, the plan is to 
analyze data for three consecutive entering 
cohorts at a time. It is important to adhere 
to the criteria for selection each year in 
order to provide comparisons over time and 
to identify trends in outcomes. It is critical 
to ensure comparison of “apples to apples” 
each year and over time. Additionally, the 
timing of the extraction and reporting must 
be consistent each year.... Ensuring ongoing
assessment and the production of results that
can be used by managers to make important
decisions will require a review of existing
personnel to identify those individuals with
the necessary skill-set to conduct the assessments
(e.g., knowledge of PACTS, ability to
apply advanced statistical techniques and
interpret results as identified in the analysis
plan, etc.).

As baseline data become available for the 
various components of the logic model, it will 
be possible to set performance benchmarks 
against the baseline measures. While baseline 
measures are indicators of where the system 
is, benchmarks identify where the system 
needs to be in the future. It is important to use 
data (baseline measures) and actual experi- 
ence when setting benchmarks to ensure they 
are realistic. This process should be a collab- 
orative effort involving input from the field. 
Once benchmarks are set, they should be 
reexamined at least every three years to assess 
progress and determine if modifications are 
needed based on the results. 


Review and Modify (Feedback Loop) 


The success of the results-based management 
framework relies on quality data, appropri- 
ate analysis and interpretation of results, and 
the utilization of the results. In particular, 
using the results to review and modify the 
framework is important to the longevity of 
the model. 

As policies and practices change, it may be 
necessary for the model to change. As with 
the benchmarks, it is important to review 
the framework design, process, and results at 
least every three years to identify any needed 
changes or modifications to the logic model, 
measures, data systems, selection criteria, 
etc. Additionally, information needs of man- 
agement and the field may shift, requiring 
changes to the framework. This feedback 
loop will help ensure a results-based man- 
agement framework that is responsive to 
changes over time. Any modifications to the 
system need to be vetted through key stake- 
holders, including the field. 


7.2 Management Support 


It is critical for the results-based manage- 
ment framework to be owned by a specific 
organizational unit within OPPS. OPPS may 
wish to consider restructuring the existing 
organizational structure to create a new unit 
focused exclusively on results management 
and the implementation and sustainability of 
the framework. Whatever approach is taken, 
it is important that there be a manager whose 
primary responsibility is the oversight of the 
framework. The manager must also have the 
authority or access to the appropriate lines of 
authority to ensure support of the framework 
and consideration of recommendations in 
decisions regarding policy, programming, 
and resourcing. To the extent possible, it is 
also preferable that the unit is viewed as inde- 
pendent of the other divisions and branches. 
This is important to ensure objectivity, neu- 
trality, and the unbiased reporting of 
findings and recommendations. Addition- 
ally, sustainability of the framework requires 
designated staff with the expertise necessary 
to ensure data quality, analyze data, and 
translate results into practical information. 
Staff also need to be able to make modifica- 
tions to the framework, including revising 
the logic model and identifying and opera- 
tionalizing new measures. 


7.3 Commitment of Resources 


As with any new effort, resources are required 
to get the process up and running and to 
keep it operating over time. A thorough 
assessment of the resource needs for com- 
pleting the remaining implementation tasks 
and sustaining the framework by carrying 
out the lifecycle plan needs to be conducted. 
This assessment should be reviewed annu- 
ally, especially within the first three years as 
changes and modifications requiring addi- 
tional resources are likely. It is clear that 
staff and resources need to be organized to 
support the framework. 


8. Next Steps 


This lifecycle document is intended to serve 
as the primary document that describes the 
content of the results-based management 
framework, the analytic approach to the data, 
implementation process, and plans for insti- 
tutionalizing the framework within OPPS. It 
is important that the information presented 
in this document be reviewed by system 
staff and stakeholders in order to verify the 
information, fill in gaps, review recommen- 
dations, and resolve unanswered questions. 
Once the framework is fully implemented, 
it will provide system staff and stakeholders 
with the information needed to better assess 
performance—and make programming and 
resourcing decisions—based on what the 
federal probation and pretrial services sys- 
tem accomplishes rather than solely on what 
it does. 